The proliferation of smart devices coupled with the ubiquitous internet has led to huge amounts of data being created every day. Analysts estimate that at least 2.5 quintillion bytes of data are created daily. These numbers are expected to go even higher, especially due to the growth of the IoT. If the prediction of Singapore Management University is anything to go by, there will be a 4,300% growth in the annual generation of data by 2020. Clearly, big data is huge (pun intended). The easiest way to understand big data is to describe it using the 5Vs of big data, namely:
- Velocity refers to the speed at which data is generated.
- Volume refers to the size of the data that is generated.
- Veracity indicates the trustworthiness and consistency of data that is being generated.
- Value of big data refers to how the data benefits the underlying processes.
- Variety of big data refers to the complexity of data, e.g. structured, semi-structured, and mixed data (text, documents, videos, audio, etc.).
Is there a problem with big data?
Even though data plays an integral role in any business, big data can also create unnecessary complexities and problems in the business process. This is why businesses are now focusing on converting big data into smart data. Here are some of the main problems of big data.
Big data is generated from heterogeneous sources and, as a consequence, arrives in different formats and representations. Managing big data therefore calls for multi-dimensional management tools. These tools must not interfere with the veracity of the data. For instance, the data stores to be used must offer scalability and elasticity. Business intelligence systems can be used to manage big data. But most organizations are already built on traditional platforms, so migrating data is not only resource-intensive but also a complex process.
The exponential growth of big data creates a unique storage problem. The bulk of the world’s information is trapped in unstructured big data. Most existing systems have storage of up to 4 terabytes per disk, while big data is measured in exabytes. It can be very difficult to tell which data must be stored and which should be skipped. The alternative is to store the data in the cloud, but transferring all of it to the cloud also takes quite a long time.
Data needs to be processed and the results delivered quickly in order for it to be useful in a dynamic business environment. Most businesses that transformed from brick-and-mortar models ended up creating a data storm that they were not well equipped for. Even with the introduction of technologies that use advanced indexing schemas, processing exabytes is still a challenge for the average organization.
What is Smart Data?
Smart data is arrived at when various big data sources are amalgamated, correlated and analyzed in order to facilitate better decision making. In simple terms, it is the process of converting big data into information that actually makes sense.
For instance, if you have a large list of all the recorded sales figures, you may want to transform the data to make it more palatable, for example by identifying the peaks in sales volume. Algorithms can be used to transform big data into insights that can be used for decision making.
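The sales example can be sketched in a few lines of Python. This is only an illustration: the function name `find_sales_peaks`, the sample figures, and the definition of a peak (a value strictly greater than both of its neighbours) are assumptions made here, not something the approach prescribes.

```python
def find_sales_peaks(sales):
    """Return (index, value) pairs where a figure exceeds both neighbours."""
    peaks = []
    # The first and last entries have only one neighbour, so they are skipped.
    for i in range(1, len(sales) - 1):
        if sales[i - 1] < sales[i] > sales[i + 1]:
            peaks.append((i, sales[i]))
    return peaks


# Hypothetical daily sales figures, invented for illustration.
daily_sales = [120, 135, 180, 150, 140, 210, 205, 190, 230, 160]

print(find_sales_peaks(daily_sales))  # [(2, 180), (5, 210), (8, 230)]
```

Ten raw numbers are reduced to three points a decision maker can act on. Real sales data would likely need smoothing first so that noise is not reported as a peak.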
Collecting huge chunks of data without adding a layer of intelligence to transform the data into smart insights will bring little benefit to a business. Smart data was inspired by Ben Othmane’s Active Bundle technology (AB). An AB consists of (1) metadata, (2) sensitive data and (3) a virtual machine.
The metadata and sensitive data components contain descriptions of the sensitive data and its use, while the virtual machine manages the operations of the AB. Smart data follows this same paradigm. However, smart data does more than merely protect the data in the bundle because it is an intelligent unit that evolves and participates in the operations of a given IoT application.
The role of contextual metadata (external data) in analytics
In 1999, NASA lost the $125 million Mars Climate Orbiter as a result of a small metadata inconsistency: one piece of software reported values in imperial units while another expected metric units. This example underscores the importance of metadata. Metadata describes other data and therefore helps provide context.
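To make the point concrete, here is a minimal sketch of a value that carries its unit as metadata, so that a consumer can convert it rather than guess. The `Impulse` class, the unit labels, and the sample reading are hypothetical and chosen only for illustration; the conversion factor is the standard definition of pound-force in newtons.

```python
from dataclasses import dataclass

# Conversion factors to the SI unit (newton-seconds).
TO_NEWTON_SECONDS = {"N*s": 1.0, "lbf*s": 4.4482216152605}


@dataclass(frozen=True)
class Impulse:
    value: float
    unit: str  # metadata: tells the consumer how to interpret `value`

    def to_newton_seconds(self):
        # Refuse to guess when the unit metadata is missing or unknown.
        if self.unit not in TO_NEWTON_SECONDS:
            raise ValueError(f"unknown unit: {self.unit!r}")
        return self.value * TO_NEWTON_SECONDS[self.unit]


# Hypothetical reading reported in pound-force seconds.
reading = Impulse(10.0, "lbf*s")
print(round(reading.to_newton_seconds(), 2))  # 44.48
```

Without the `unit` field, the bare number 10.0 could be read as either unit, and the two interpretations differ by a factor of more than four.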
Contextual metadata plays a huge role in the accurate interpretation of data by both humans and machines. Metadata is useful in analytics on three main tiers. The first tier deals with a developer’s immediate needs like the determination of the internal structure of a file in order to read it successfully.
The second tier deals with the broader implementation of strategic issues, while the third tier is concerned with the development of business-wide strategies. Contextual metadata comes in handy when migrating, consolidating, analyzing, discovering, and integrating data.
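As an illustration of the first tier, the sketch below uses a small metadata record to read a delimited file correctly. The metadata layout, the column names, and the sample rows are all hypothetical. The point is only that the reader learns the delimiter and column types from the metadata instead of guessing them.

```python
import csv
import io

# Hypothetical metadata describing the internal structure of the file.
metadata = {
    "delimiter": ";",
    "columns": [("order_id", int), ("region", str), ("amount", float)],
}

# Hypothetical file contents, held in a string to keep the example self-contained.
raw = "1001;EMEA;250.50\n1002;APAC;99.90\n"


def read_with_metadata(text, meta):
    """Parse delimited text, naming and typing each field from the metadata."""
    reader = csv.reader(io.StringIO(text), delimiter=meta["delimiter"])
    rows = []
    for fields in reader:
        row = {name: cast(value)
               for (name, cast), value in zip(meta["columns"], fields)}
        rows.append(row)
    return rows


records = read_with_metadata(raw, metadata)
print(records[0])  # {'order_id': 1001, 'region': 'EMEA', 'amount': 250.5}
```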
Data is arguably the most important asset in organizations. Data stores all the records in the business process and when subjected to the right analytics, the data can furnish management with insights into the future. This way, the business can spend more time focusing on areas that have been tested and proven to work.
Clearly, leveraging data the right way is key to the health of a business. Metadata is what empowers businesses to leverage their data effectively across the board. Metadata gives a 360-degree view of data, which helps to promote its understanding. It wouldn’t be possible to leverage data effectively without understanding it.
Why good data is the key to unlocking AI for businesses
The future of AI can be summarized in one phrase – machines will become more intelligent than artificial. The machines will rely less on big data and more on smart data for decision making. This will enable machines to use a human-like approach in solving problems, which will make it much easier to deploy AI across different industries. Traditionally, AI was built through machine learning, which relied heavily on big data.
But neural networks trained this way have limitations in situations where little data is available. For instance, a driverless car might struggle to process an anomaly such as children in Halloween costumes moving to and fro across the road. Another interesting illustration is how the iPhone’s face recognition system struggles to identify someone who has just woken up.
The solution is to start building systems that are more intelligent than they are artificial. And this can only be achieved through smart data. Without smart data, AI loses its intelligence. Here are some practical illustrations that serve to show why good data is the key to unlocking AI for business.
Data-hungry approaches can be overcome through the modeling of what a human expert would do when faced with high uncertainty but with little data. Siemens has used this approach to control the complex process of combustion in gas turbines. How long the turbine operates is typically determined by a number of variables.
Building efficient machines would, therefore, require running the machines for almost a century to get enough data for training. The Siemens engineers overcame this challenge by using methods that rely on little data, with the machines set to continuously seek the best solution in real time, almost like a real human expert.
Robots that have a conceptual understanding of things, just like humans, can learn using far less data. Mark Zuckerberg, Marc Benioff, and Jeff Bezos have backed Vicarious, a project to develop an AI for robots that uses reasoning. Instead of being programmed with big data, the robots will learn and adapt based on a couple of general examples. This is an example of how smart data can be used to develop machines that are more human-like in their conceptual understanding of the world.
A number of organizations are already working on teaching machines to use common sense. Machines are taught everyday actions and objects, which helps them deal with unforeseen situations while progressively learning from experience. Common sense comes naturally to humans but not to machines. Machines will need some explicit training and smart data against which to benchmark. AI2 is currently developing an assortment of tasks upon which “common sense” can be measured for machines.
The influx of data can be a blessing or a curse depending on how an organization responds to big data. Some companies will take advantage of big data to differentiate themselves. Others will be completely overwhelmed by the challenges of collecting, cleaning and protecting it. One thing is for sure though – smart data is the future of AI.
Smart data is the missing link that will lead to the creation of intelligent systems that are intuitive and that have a conceptual understanding of the world.