Do you need to start recognizing and unlocking the value of your smart city data? Bob McQueen provides a timely guide to doing just that

Smart cities encompass a huge variety of elements: smart energy grids, smart places to live and work, urban connectivity and mobility. In deciding to develop a smart city there are many challenges and opportunities to address, and the question of coherence and structure is one of the most important. It is all too easy to develop a smart city program that consists of many standalone projects, each delivering tangible benefits. While this will take the city forward towards smartness, it misses the opportunity to take advantage of the reinforcing effect of planning the projects so that they work together and exploit commonality. That is also the way to get the best value for money from the investment in smart cities.


Consider the city as an internal combustion engine and each project as a cylinder: the highest power, for a given amount of fuel, is achieved by coordinating the cylinders so that they reinforce rather than conflict with each other. The very best effects come from a smart city program in which all the individual project elements are aligned and working together. This is especially true of smart city data and analytics. While individual projects will make progress on their own, there is a great danger that similar data will be collected many times over. It is also highly likely that the data collected will be stored in project-specific silos that achieve the objectives of each project but miss a valuable opportunity for the greater good of the city: creating a city- or enterprise-wide view of the data that enriches the analytics and provides a complete picture, feeding multiple city departments.

In the past we deliberately partitioned data into smaller chunks to make it more cost-effective and manageable. Data science has moved on and this is no longer a requirement: we now have data management and analytics tools that handle huge data sets very cost-effectively. It is now possible to keep the data in a “glob” – an unstructured data mass – and invest in the structuring only when queries are launched, an approach often called schema-on-read. Results of analytics can also be kept in what has come to be known as a “data lake”.
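To make the “structure at query time” idea concrete, here is a minimal schema-on-read sketch in Python. The sensor names and record fields are invented for illustration; the point is simply that no schema is imposed at ingestion, and parsing happens only when a query runs:

```python
import json

# Raw "glob": heterogeneous smart-city records kept as unstructured text,
# with no schema imposed at ingestion time.
raw_glob = [
    '{"sensor": "bin-17", "kind": "waste", "fill_pct": 82}',
    '{"sensor": "loop-04", "kind": "traffic", "vehicles_per_min": 31}',
    '{"sensor": "bin-03", "kind": "waste", "fill_pct": 45}',
]

def query(glob, predicate):
    """Apply structure only when the query is launched (schema-on-read)."""
    for line in glob:
        record = json.loads(line)  # structuring happens here, at query time
        if predicate(record):
            yield record

# Example query: which waste bins are nearly full?
full_bins = [r["sensor"] for r in query(
    raw_glob,
    lambda r: r.get("kind") == "waste" and r.get("fill_pct", 0) > 75,
)]
print(full_bins)  # ['bin-17']
```

The same raw glob can later serve an entirely different query – traffic volumes, say – without any re-ingestion, which is what makes the single city-wide data mass more valuable than per-project silos.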

The cost of data storage and management has been reduced considerably over the past few years. In the past, it could cost hundreds of millions of dollars to store and manage data. In the new world of data science, we’re probably talking somewhere between US$10,000 and US$50,000 per terabyte per year to support the entire data ingestion, data storage and data management process. For example, if a smart city is accumulating a terabyte of data every day (about 30 TB per month) it would cost somewhere between US$2 million and US$10 million per year to store and manage smart city data.

This seems like a big number when you look at it in isolation, but the average annual budget of the 100 largest US cities is more than US$2 billion, so even the high estimate would represent a paltry 0.5 per cent of a smart city’s annual budget. These are also likely to be high estimates: the latest data science techniques allow us to place the data we need less often (loosely coupled) in cheaper archives or “cold storage”, keeping only the data that is tightly coupled to the organization and used regularly in the more expensive “hot” storage.

So, the cost is relatively low and likely to be even lower when detailed calculations are conducted and data science progresses even further. Now what about the value?


There are two major aspects to the value of data, both of which depend on converting data into information, and then into insights and understanding about the operation of the city or enterprise. The first aspect is public sector use of the data to improve the efficiency of smart city service delivery and the overall optimization of the city environment. Examples include demand-actuated waste collection, triggered by garbage can sensor data, and Mobility as a Service (MaaS), which offers travelers a range of options along with details on the cost, time, and reliability of each service. The second aspect is private sector use of the data to establish new markets and commercial services. The availability of the right data also enables public services for smart city transportation to be optimized: planners, managers, and operators gain the insight and understanding required to better match supply and demand as they vary over time and space.

Having an appreciation of the costs and potential benefits of building a data lake, the next question is how to get started. The obvious first step is to stop throwing the data away. We tend to have a low perception of the value of data and a concern that data is expensive to store and manage; both are misconceptions. It is true that raw data is of limited value: it is a raw material that needs further work to release its value. But if you do not have the raw material, you have nothing to work with. This creates what my dear friend Prof. Kan Chen would describe as a “chicken and egg problem”: we do not value data, so we do not keep it, and because we do not keep it we cannot realize and understand its value. It becomes necessary to move to the end of the process to effectively demonstrate the value of data. Only by witnessing the insight and understanding that big data and analytics can deliver will we be inspired to cherish and appreciate data.

You have probably noticed that the treatment of costs was much more numerical and objective than the treatment of benefits above. This points to one important aspect of our data market approach – an effective ability to place a value on data, based on the use to which the data will be put and the benefits to be realized from the information extracted from the data.


Here, concisely, is a proposed approach to the creation of a data lake via a data market project. The intention is to address the “chicken and egg” problem by introducing a signature project – the data market – as an early reason for building a smart city data lake. Many developing smart cities in the US have already understood the value of an integrated data exchange that takes data from multiple sources and makes it available to a wide range of people. This of course assumes that people will use the data once it is available.

We can go even further and create a data lake with advanced analytics capabilities, using the data market as a starting point. An effective approach to a data market would incorporate cost-sharing between public and private sectors, availability of both free and premium data, source data tracking and a data valuation mechanism. It would harness the power of data science to establish and manage an effective means to extract value from data. It would also balance the need for public stewardship of money invested in data collection and management with the needs of the private sector to establish new markets and services.


A good next step would be the creation of a public-private partnership to develop a prototype data market that can then be scaled and finalized into a working version.

In addition to the benefits of a data market itself, this would provide a clear demonstration of how to harness the value of data and its enabling analytics, and of how data can be a unifying factor in the development of a coherent smart city.

We have a wonderful opportunity presented to us to harness the power of data science and build a bridge to smart city needs and objectives. Taking the first steps towards accumulating data in a central repository and applying advanced analytics would deliver some amazing value to smart cities. It would also provide documentary evidence for things that you intuitively know already. For example, you probably already know that your city is smarter than the average city, but do you have the analytics to prove it?


Bob McQueen is the CEO of Bob McQueen & Associates and is also North American Bureau Chief for Thinking Highways