21st Century Mapping: An 8-Part Series; Part 6: Geographic Big Data

The disputable claim that 80% of all data is geographic is made over and over again by big players in the GIS community, despite none of them providing a reference to a study or data set that validates it. Spatial data companies like this statistic because it convinces large companies to include a geographic component in their data aggregation and to use software from vendors like Esri to organize and manipulate their data. Geographic data is any data set or database that has some spatial information, such as coordinates or addresses, associated with the other information it contains. Traditionally, it was collected using ground surveying, photogrammetry, and remote sensing. As the technology becomes more accessible, data collection now extends to laser scanning, mobile mapping, geo-located sensors, geo-tagged web content, volunteered geographic information (VGI), global navigation satellite system (GNSS) tracking, and so on.


Today, we will focus on the data that is collected from the general public every day. VGI is data that is sent to a database by consumers themselves instead of being generated by a dedicated sensor or data collection study. According to a study conducted by the Pew Research Center in 2018, 81% of Americans own a smartphone. These devices that we carry around every day collect data, attach the GPS coordinates that our phones register, and transmit this information to database systems that have been colloquially dubbed “Big Data”. Before VGI became a wealth of information that companies could mine to make decisions about consumer choices, they had to conduct expensive and often inaccurate studies. The smartphone changed the dynamic: the corporate world no longer needed to go out and collect data, because consumers began transmitting their data straight to these companies’ databases.
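To make the idea of a geo-tagged record concrete, here is a minimal Python sketch. The record layout, the sample coordinates, and the 0.01-degree grid size are all assumptions made up for illustration, not any company’s actual schema. It shows how a stream of events with coordinates attached can be snapped to grid cells and counted, which is roughly the first step in turning raw points into a geographic trend.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class GeoEvent:
    """One hypothetical record sent from a phone: who, where, and what."""
    user_id: str
    lat: float
    lon: float
    action: str


def grid_cell(lat, lon, cell_deg=0.01):
    """Snap a coordinate to the integer index of the grid cell containing it."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def count_by_cell(events, cell_deg=0.01):
    """Count how many events fall into each grid cell."""
    return Counter(grid_cell(e.lat, e.lon, cell_deg) for e in events)


# Made-up sample data: two events a few dozen meters apart, one far away.
events = [
    GeoEvent("u1", 40.7128, -74.0060, "search"),
    GeoEvent("u2", 40.7130, -74.0055, "purchase"),
    GeoEvent("u3", 34.0522, -118.2437, "search"),
]
counts = count_by_cell(events)
```

The two nearby events land in the same cell while the third gets its own, so even this tiny example starts to show where activity clusters.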


Amazon, an ‘everything under the sun’ style company, offers its customers thousands, if not millions, of products, entertainment options, and services through its website. In order to provide a more tailored experience, it collects data on its users so that its algorithms can better predict what you might want to buy. From the moment you load the Amazon page, it creates a data file associated with your Amazon account or the IP address of your computer to estimate where you are located, what the average income is in the area where you live, and how much time you spend looking at each product. The company takes the information collected on you, compares your consumer choices to those of other buyers, and uses external databases to understand the area you are buying from in order to create a customer profile. This 360-degree approach to understanding consumers is one of the major ways that “Big Data” gets used in our society.
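The kind of profile described above can be sketched in a few lines of Python. Everything here is hypothetical: the lookup tables, region names, income figures, and function names are invented for illustration and say nothing about how Amazon actually does it. The point is only to show the pattern of joining a visitor’s IP address and browsing time against external, area-level data.

```python
from collections import defaultdict

# Hypothetical lookup tables with made-up values. A real system would rely on
# commercial IP-geolocation and census-style data sets instead.
IP_PREFIX_TO_REGION = {"203.0.113.": "Region A", "198.51.100.": "Region B"}
REGION_MEDIAN_INCOME = {"Region A": 52000, "Region B": 78000}


def region_for_ip(ip):
    """Return the region whose prefix matches the IP address, or None."""
    for prefix, region in IP_PREFIX_TO_REGION.items():
        if ip.startswith(prefix):
            return region
    return None


def build_profile(ip, page_views):
    """Combine location, area income, and viewing time into one profile.

    page_views is a list of (product, seconds_viewed) pairs.
    """
    seconds = defaultdict(int)
    for product, secs in page_views:
        seconds[product] += secs
    region = region_for_ip(ip)
    return {
        "region": region,
        "area_median_income": REGION_MEDIAN_INCOME.get(region),
        "seconds_by_product": dict(seconds),
        "top_interest": max(seconds, key=seconds.get) if seconds else None,
    }


profile = build_profile(
    "203.0.113.42", [("tent", 90), ("stove", 30), ("tent", 45)]
)
```

Note that the income figure describes the area, not the person. The profile is an inference built from where you appear to be, which is exactly why the geographic component matters so much.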


Google is, at its core, a database company. Despite now having a hand in telephones, wearable technology, self-driving cars, and pretty much every other emerging technological field, most of its users interact with the search engine database. Over 3.5 billion search queries are made every day, sending the algorithm to search through 20 billion (and growing) web pages. Google sends ‘bots’ to scan the world’s web pages, extracting keywords and creating a record attached to each page. So, if you google something, the results are shown based on the closest match of keywords. Back in the early 2000s, the algorithm matched each individual word in the query to find results. Since then, Google has moved away from purely index-based search toward a more semantic system that returns results based on the meaning of the query as a whole. Universal Search, first implemented in 2007, began blending results from separate sources such as news, images, and video into one results page, and Knowledge Graph, introduced in 2012, added a database of people, places, and things and the relationships between them. As its database grew and the data collected from searches began forming customer profiles, Google also started using your search history, purchasing habits, frequented websites, and the information attached to your Google+ account to sell you a better web experience. Searching with Google is free of monetary commitment, but when you open its browser and begin typing, you are paying with your data. Google uses the profile you allow it to create to sell advertisements and make suggestions that it profits from.
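The early keyword-matching approach described above is usually built on what is called an inverted index: a table that maps each word to the pages containing it. The Python sketch below is a toy version under heavy assumptions (three made-up pages, a naive tokenizer, and ranking by simple count of matching words). It is not Google’s algorithm, only an illustration of the basic idea.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map every word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Rank pages by how many distinct query words they contain."""
    scores = defaultdict(int)
    for word in set(tokenize(query)):
        for url in index.get(word, ()):
            scores[url] += 1
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(scores, key=lambda u: (-scores[u], u))


# Made-up pages standing in for what a crawler might have collected.
pages = {
    "example.com/maps": "Free maps and GIS data for hiking trails",
    "example.com/gis": "Introduction to GIS software and spatial data",
    "example.com/recipes": "Easy dinner recipes",
}
index = build_index(pages)
results = search(index, "GIS spatial data")
```

Here the GIS introduction page ranks first because it matches all three query words, the maps page comes second with two, and the recipes page is not returned at all. A semantic system goes further by trying to work out what the query means rather than only counting shared words.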


Data security has become a major topic of conversation in the 21st century, and it will continue to fuel debates about how we let these companies use or misuse our data. While the now common GIS adage that “80% of all data is geographic” might not have a clear origin point, the saying holds relatively true. Maps are an integral way of understanding the geographical trends in big data, and they can be the deciding factor in where products are advertised and made available to consumers. The conversation around big data is too large for just one blog post, but the argument that these data points are aggregated using spatial data stands. Some users combat this type of data collection by using a VPN (virtual private network) to disguise their location, and other measures like ad blockers and turning off cookies can help limit the collection, but this is not at the forefront of the average internet user’s mind. I think that as we move further into this century, how we volunteer our data will be increasingly scrutinized, but for now, this is the reality we operate in. Our data, just like Venmo, PayPal, or Bitcoin, is just another form of internet currency. Keep that in mind next time you open Google.

