MySQL PHP Geolocation - Zip codes: latitude, longitude, distance

This is one of my first really complex SQL query entries. The premise goes like this: you have a user database with a bunch of people who signed up for your service, which includes the user's zip code. Now you want to write a query so you can show user A all the other users in your database that are within, say, 10 miles of her zip code.

It's called Geolocation.

In order to do this, you need two things: 1. a database of all US zip codes, and their corresponding semi-exact latitudes/longitudes; 2. someone like me to teach you spherical geometry.

Lucky for you, you've come here, because not only do I have a nice 7-zipped archive of a US zip codes to latitude / longitude database for you to download, this also happens to be my website, so I can teach you geometry as well!

Now here's a little PHP/SQL for the actual mathematical processing (nearly all done in the database!):

  1.  
  2. // connect to the database
  3. mysql_connect('localhost', 'username', 'password');
  4. mysql_select_db('awesomedb');
  5.  
  6. // we need the userid of the person who is doing the searching:
  7. $userid = 1;
  8. // and the max distance (in miles) you want to show results from:
  9. $max_distance = 10; // in this case, 10 miles
  10.  
  11. // now we go out to the database and get that user's latitude and longitude based on the user's zipcode
  12. $user_query = mysql_query("SELECT z.latitude, z.longitude FROM users u, zipcodes z WHERE u.id='$userid' AND u.zipcode=z.zipcode");
  13. if (mysql_num_rows($user_query)) {
  14.         $user_data = mysql_fetch_array($user_query);
  15.         $latitude = $user_data['latitude'];
  16.         $longitude = $user_data['longitude'];
  17.         // then we use that latitude and longitude to write our complex SQL query:
  18.         $complex_query = mysql_query("SELECT u.id, truncate((degrees(acos(sin(radians(z.latitude)) * sin(radians(".$latitude.")) + cos(radians(z.latitude)) * cos(radians(".$latitude.")) * cos(radians(z.longitude - ".$longitude.")))) * 69.09),1) as distance FROM users u, zipcodes z WHERE u.id != '$userid' HAVING distance < $max_distance");
  19.         if (mysql_num_rows($complex_query)) {
  20.                 $i = 0;
  21.                 while ($row = mysql_fetch_array($complex_query)) {
  22.                         // this is a nice array of userids that are within 10 miles:
  23.                         $found_user_ids[$i] = $row['id'];
  24.                         $i++;
  25.                 }
  26.         } else {
  27.                 // no results!
  28.                 echo 'Sorry, there are no other users within '.$max_distance.' miles of you';
  29.         }
  30. } else {
  31.         // that user's zipcode is not in the database
  32.         echo 'Sorry, that zipcode is not found in the database!';
  33. }
  34.  

There is one little chunk in the $complex_query that was new to me the first time I used this code, and that was the HAVING portion. In the past, I've always tried to use a WHERE statement to limit results based on a calculated column (in this case, the distance column that is the result of the complicated mathematical formula) but I would get an error since the WHERE portion of a SQL statement is actually required to gather the SELECT data, so data from the SELECT cannot be used to limit itself (take a breath and think about it, it does make sense). Enter HAVING.

I know I told you I'd teach you geometry, but I don't actually completely understand what the math above does on a figure-to-figure basis. I do know that is calculates the mean distances between every user in the database. And I know it does so taking the curvature of the Earth into consideration. Think of it like this: if you go near the North Pole, you can walk over every line of longitude (the vertical lines on the globe) in a few seconds by simply walking all the way around the North Pole. But as you get closer to the equator, the distance becomes greater between lines of longitude. So you need SINs and COSINs and Radians and stuff.

At the top, the lines are close together. In the middle, they are further apart. You can I are somewhere in the middle, unless you live in the tropics, or you're Santa Claus, which would be cool, I mean, how many websites does Santa actually visit?

Share/Save/Bookmark