GeoFlow, SQL Server and Millions of Rows

Over the past few months, I have been asked if it is possible to connect GeoFlow to SQL Server database; and the more intriguing question is, why don’t GeoFlow connect to Tabular Model straight away instead of using PowerPivot. Although I do not know if there is such a plan related to the latter question, I do know that you can visualise data in SQL Server database or Tabular Model using GeoFlow.

The key ingredient for exposing this data is by storing and modelling it in PowerPivot. Why so? Perhaps one of the reasons is that most organisations, big or small, and individuals have or know how to use Excel (yes, they will need Excel 2013). Buying Excel 2013 Professional edition or higher would be cheaper to afford than the cost of buying a license for SQL Server. So, PowerPivot is the middle ground and it is the best win-win situation, for now. Those organisations which have invested time and effort converting their PowerPivot workbooks to Tabular Models however, would probably appreciate being able to expose their data in Tabular Model using GeoFlow.

Exposing SQL Server for GeoFlow

I have recently posted a step-by-step instruction to expose data stored in SQL Server database and Tabular Model to GeoFlow at Microsoft Customer Reviews of STB Products.

In the step-by-step instruction, I did mention that I was using 22 Million of records. The sample workbook is about 580 MB and shared via SkyDrive. PowerPivot in the workbook contains daily data of Internet Speed tests for 5 years and about 2,200 unique locations. The workbook cannot be opened using Excel Web App due to the size and GeoFlow is not supported in Excel Web App either. So, you will need to have Excel 2013 Professional edition or above, and have GeoFlow installed.

I have also created a tabular model based on the 22 Millions records, but only exposing the aggregated data via DAX. The PowerPivot in this workbook contains about 1 Million of records, containing monthly data for 5 years worth of Internet Speed tests conducted in about 2,200 unique locations. The workbook is about 6.2 MB and shared through SkyDrive.

GeoFlow for Excel 2013 - SQL Server Data Refresh 22M rows

 

I am deliberately making the workbooks available via SkyDrive, in case in the future and wouldn’t it be nice if, GeoFlow can be supported in Excel Web App too :)

Wrap Up

GeoFlow Preview for Excel 2013 can connect to SQL Server via PowerPivot. This article has shown how GeoFlow can render geospatial information from millions of rows stored in SQL Server database and Tabular Model.

 

Further Reading

GeoFlow for SQL Server Database and Tabular Model at Microsoft Customer Reviews of STB

Sample Workbook: GeoFlow using SQL Server database

Sample workbook: GeoFlow using Tabular Model

 

 

Speaking at PASS Summit 2013

I am delighted and honored to be selected as a Community speaker for a Regular Session at PASS Summit 2013. My session is “From Impaired to Insightful: Analysis with Geospatial Data“. This year this summit will be held in Charlotte, NC on Oct 15-18, 2013.

I look forward to meeting PASS Summit alumni, speakers and First Timers too! Is this your first visit to PASS Summit outside Seattle? Mine too! Good news, Melissa Coates (@SQLChick) has posted tips on how to get around Charlotte. Last but not least, congratulations to the other selected speakers at PASS Summit! Also special thank you to volunteers and organizers of PASS Summit!

PASS Summit 2013

 

Check out Coupon Code that may be available from your User Group. Malibu SQL Server User Group members should use the code mentioned here.

See you in October!

Revisiting GeoFlow Public Preview

As you may already know, GeoFlow Public Preview for Excel 2013 has been out for nearly a month now. I have posted a couple of articles on GeoFlow, the Beta version and the Public Preview version. I would like to revisit the Beta  post on “What GeoFlow isn’t/doesn’t”. In this post, I would also like to highlight top 5 new features in the Public Preview.

Revisiting The “Not” List

Below is a revised list of  what GeoFlow isn’t/doesn’t, for the Public Preview version.

  1. It does not support SQL Server Geography data type.
  2. It does not support location hierarchy or aggregation level (unlike Power View, where you can double click on Country and it will go to more granular geographical level as long as the model has been setup properly).
  3. It does not work without internet connection. A suggestion on being able to cache the map has bee logged on Connect.
  4. It does not have a “Z” index. Such information may be handy to map air traffic. This conversation came up during SQL Saturday 177 – super smart audience!
  5. It does not support elevation information. Another gem that came up at SQL Saturday 177.
  6. It does not have any tool to export to mp4 or other video file format. Although you can use screen capture software to save the video. This suggestion has been logged on Connect.
  7. It does not support Chloropleth map.
  8. Currently GeoFlow does not seem to support discrete GPU; some laptops, Lenovo and Sony VAIO require the discrete GPU to be disabled. This means that it may not be possible to show case GeoFlow on a larger HD display via HDMI. The issue with VAIO laptop has been logged on Connect.
  9. It does not allow custom map background / ESRI Shapefile. This suggestion has been logged on Connect.
  10. Data refresh is not reflected on the map. A new separate workbook needs to be recreated. GeoFlow seems to import the data from the workbook once. This issue has been logged on Connect.
  11. It does not allow customizable tooltip; i.e. it only uses the information that is used for mapping. This suggestion has been logged on Connect. Update on GeoFlow Preview: A free text form tool tip / annotation box is available though.
  12. It does not support filtering on Category and Height values. The data must be prepared first for filtering prior to creating the first GeoFlow map due to the limitation on data refresh as outlined in number 10. This has been logged on Connect.
  13. GeoFlow is restricted to 1,048,575  rows (+ 1 table heading row) to be precise. This is because GeoFlow is based only on a table that can be created within Excel 2013.
  14. GeoFlow does not give the option to change the colors used for the markers (bubble or column stack bars) from the default colors associated to the themes. This suggestion has been logged on Connect.
  15. GeoFlow does not have an “unwrap” option to flatten / view the globe as a whole. This suggestion has been logged on Connect.

New Features

Top 5 new features in GeoFlow Public Preview:

1. Top 100 chart

2. Geocoding validation on the selected columns

Clicking on 85% on the example above will display the following Geocoding Alerts window:

3. Better UI interface

4. Data Refresh

5. Special Effects on tour mode visualization

 

 

I will be presenting att SQL Lunch this Friday (May 10, 2013) at 9:30AM PDT to discuss the enhancements and the limitations of current GeoFlow Public Preview for Excel 2013. Hope you can join me then!

 

Wrap Up

If you are keen on Geospatial reporting, GeoFlow is definitely worth checking out. There are some limitations in the current Public Preview as discussed above. I think now is a great opportunity to test out the Public Preview and providing feedback / suggestions for a better product either via Connect or GeoFlow Forum.

 

Further Reading

GeoFlow, Is that it, Microsoft? by Jamie Thomson (Worth reading the Comments section)

GeoFlow Product Page on Office.com by Microsoft

GeoFlow Beta - Untapping 3D Visualization by Julie Koesmarno

Public Preview GeoFlow for Excel 2013 by Julie Koesmarno

 

 

Insightful Geospatial Reporting

Thank you for those who attended my “From Impaired to Insightful Geospatial Reporting” session at beautiful Orange County SQL Saturday 201. We started early at 8:30AM on April 20, 2013 which did not deter a lot of you to come in and learn more about Geospatial Reporting.

Below are brief details on my session:

From Impaired to Insightful Geospatial Reporting

Slide Deck

Abstract

It is expected that business reports be insightful and engaging. Region based reporting can be transformed from boring grids to visually interactive maps. This session will describe how to embark upon self-service and mobile BI solutions that are available within the Microsoft BI stack for analytical geospatial reporting. You will learn the basics of geographical data, including GIS, Longitude-Latitude coordinates and free form text. Using these different data types, we will discuss SSRS 2012, Power View and Excel 2013 tools that can leverage insightful analysis. Finally and most importantly, attending this talk will get you started in learning the essential techniques to deliver geospatial reports that tell meaningful stories for users.

Geospatial Reporting Tools by Maturity

From left to right: GeoFlow Public Preview for Excel 2013, Geographic Heat Map app for Excel 2013, Bing Map app for Excel 2013, Power View and SSRS.

 

Feel free to leave comments. This session is a brief summary of my view of the Geospatial reporting tools that are available on Microsoft platform.

Slide decks for other SQL Saturday 201 sessions can be downloaded here: http://www.sqlsaturday.com/201/schedule.aspx. The sessions marked with asterisk (*) on the schedule are those with downloadable materials.

 

SQL Saturday 211 – Geospatial Reporting

Thank you to those who attended my session at SQL Saturday Chicago (#SQLSat211) in Chicago on the weekend. I really appreciate the feedback from you.

The complete demo files are located here:

SQLSat211 – Analytic Cartography – Julie Koesmarno

In this presentation, I discussed from the mature Traditional BI tool such as SSRS to self service BI using Power View and GeoFlow. to mobile BI solution in a small scale such as Excel Apps: Heat Map and Bing Map. Using Earthquake incidents we learned what tools were best at, from data exploration to visually captivating view. Coupling the correct tool with a good story line, we can achieve effective geospatial reporting.

Special thank you to SQL Saturday 201 Organizers, Volunteers and attendees for making it happen!

 

Further Reading

Power View Map in Excel 2013 Preview  by Julie Koesmarno

Plan a Map Report (Report Builder and SSRS) by Technet Microsoft

Introduction to GIS by University of Missouri – St. Louis

Public Preview GeoFlow for Excel 2013 by Julie Koesmarno

Shape2SQL by Morten Nielsen

Spatial Data Series (SSRS) by Jeffrey Verheul

ArcGIS Tectonic Plate Lines by ArcGIS

SSRS Tips / Tricks by Jason Thomas

Public Preview GeoFlow for Excel 2013

If you love geospatial data or geographical data, you should check the Public Preview of GeoFlow. It is an add in to Excel 2013 which renders 3D visualization of geographical and temporal data. The Beta version was announced late last year and I provided a review here. The Public Preview version is a much improved version, from performance, user interface, new features (such as chart) and many others. The Public Preview of GeoFlow was announced this morning and demoed at PASS BA Conference.

Without further ado, the download link is here:

http://www.microsoft.com/en-us/download/details.aspx?id=38395

Here is a teaser from me.

Stay tuned for more reviews from me!

If you are attending PASS BA Conference, check out Jason Thomas’ session (Geospatial Analysis Using Microsoft BI) on Friday (Apr 12, 2013). There may be a demo of GeoFlow there too.

I will also be doing Geospatial visualization talks at SQL Saturday 211 in Chicago on Apr 13, 2013 and SQL Saturday 201 in Orange County on Apr 20, 2013. I will show you how GeoFlow works live!

Official Links from Microsoft
GeoFlow Forum: Post and answer questions for the product team and community
GeoFlow Facebook Page: Share datasets, tours, and blogs
GeoFlow Product Page on Office.com: Download requirements, directs to Web Download page below
GeoFlow Download Page: Download the bits on Microsoft Download Center

Further Reading / Research

GeoFlow Beta - Untapping 3D Visualization by Julie Koesmarno
Analytic Cartography: Master Geospatial Reporting session at SQL Saturday 211 in Chicago
From Impaired To Insightful Geospatial Reporting session at SQL Saturday 201 in Orange County

 

Presenting And Loving It!

This month T-SQL Tuesday #41 is hosted by Bob Pusateri (b | @SQLBob). T-SQL Tuesday is an event where where SQL Family join in writing and reading a new topic given each month. The topic for this month is Presenting and Loving It! So, I would like refer to my personal story, my journey to presenting. 

 

Being Taught and Taught

In my late teens, I started tutoring Mathematics to a few primary school and high school students back in Australia as my second part time job. I quite enjoyed teaching one-on-one; because I got to know the students and tailored the way of teaching. I took the step further and applied to become a lab/class tutor for 1st year students when I was in my 3rd year at Australian National University. I did not like presenting much when I was younger, but I loved sharing knowledge with friends, teaching them how things worked. So I thought tutoring at University was not the same as presenting. I was wrong. It was very similar. The difference is that the agenda for tutoring at University for each week has been decided (which was nice!). So in this case tutoring was easier. But everything else was fairly similar. This was how my attachment to presenting started…

Career in Data

I stopped presenting (teaching) in front of a group when I started my full time job. Eventually in 2011, Victor Isakov suggested that I should go to PASS Summit; and that I should get into presenting. I said to Victor that I had past experience in terms of tutoring, but my skill in SQL Server was not good and should be improved. Victor suggested that presenting is one of the ways for us in becoming better as a professional. I listened to him.

I started presenting with him at Sydney SQL User Group. I was very nervous, but Victor was a natural! I really enjoyed delivering a presentation with him and thought that he was right all along about becoming better at it because I really had to research the subject matter. I also met lots of new people from there and the audience were supportive. I thought, “Hey, I know I have a long way to go to become natural at presenting, but I could work on it, right?!”. So since then, I have decided to submit more presentations. I do not know everything about SQL Server, but I do know a lot more than I did 1.5 years ago; and I plan to learn more through presenting. What I love the most about presenting is the discussions; we all learn from each other.

Presenting and Travelling!

Through presenting, I have managed to visit a number of cities outside Australia and make friends from different parts of the world. Besides in Australia, I have presented in USA (Seattle, Dallas, Silicon Valley) and Canada (Vancouver). Australia, USA and Canada so far have proven to be similar but with subtle differences; from culture, audience participation, to how they serve their bacon!

My next trip is to Chicago for PASS BA Conference tomorrow (Apr 10, 2013), where I will be presenting with my friend, Joshua Fennessy (b | @JoshuaFennessy). I will also be presenting at SQL Saturday 211 on the Saturday (Apr 13, 2013) immediately after PASS BA Conference. The week after that, I will be presenting at SQL Saturday 201 Orange County (Apr 20, 2013). Hope to see you at these events and future ones too!

 

  SQL Saturday 211 Chicago  

 

Wrap Up

Thank you, thank you, thank you and a thousand times thank you! Thank you to the event organizers who have given me a chance to speak. Thank you to the audience for attending/tuning into my presentations. Thank you to my #sqlfamily for giving me encouragement to be a better speaker; directly and indirectly. Thank you to all other speakers out there for putting the time and effort into sharing your knowledge with the community.

Thank you Bob Pusateri for hosting this month’s T-SQL Tuesday!

 

Twitter Analysis on PASS Summit 2013 – Part 1

Two of the most important times in the year as upcoming and existing SQL Server Community speakers are the submission day for PASS Summit and the Summit days (or shall we say “week”?).

I am one of community speakers – nervous, anxious and excited all at the same time about session abstracts and the opportunity to speak in front of other passionate SQL Server professionals. PASS Summit is more than just a conference for most of us – it’s a gathering for #SQLFamily; we share knowledge, we have fun and we bring back lots of good memories as well as new learning.

April 3, 2013 9 PM is the big deadline for Community speakers. Using Analytics for Twitter I was able to gather some information on how busy the community was today – discussing SQL PASS and PASS Summit itself. Without further ado, here are the charts.

Let me see!

Analysis based on today’s date up to 9:15 PM (April 3, 2013)

 

Analysis based on location the past week.

 

Tone analysis the past week

Note: The time play tone analysis has been replaced with Line Chart instead for easier viewing on the website.

If you are interested in tone / sentiment analysis on social network, below is a good session that cannot be missed at PASS BA Conference on April 10 – 12, 2013. Hope to see you there!

Social Text, Sentiment, and Tone Analysis by Ruben Pertusa Lopez  Paco Gonzalez

 

Wrap Up

Using Analytics For Twitter add-in, we can analyze the buzz around Community Speaker submission for PASS Summit 2013. The submission ended on April 3, 2013 at 9 PM (PST).

I am certain that the conversation about abstract submission will continue for at least another week. Topics discussed will include wishing good luck to other speakers, encouragement, selection announcement, the rules, the system (time out issues), deadline and many others.

I’ll post another article with some more Analytics for Twitter in about a week to follow this up.

 

Further Reading

Analytics For Twitter on Technet

Analytics For Twitter Download by Extended Results

Social Text, Sentiment, and Tone Analysis session at PASS BA Conference, 2013

Now What? by Colin Stasiuk

 

Don’t Drive Your DBA Crazy!

“Don’t Drive Your DBA Crazy!” – that was my message today at PASSWIT session hosted by Pragmatic Works. The idea is quite often we concentrate on business requirements (which is superb by the way) and often forget about what is going to happen when we deploy this solution to our Production servers. This session reviews some areas that database developers should be aware of when developing solutions. It also provides common techniques to tackle high volume data processing challenges, that are kind to our production environment (i.e not driving our DBA crazy). Both SQL Server 2008 and SQL Server 2012 features that are useful for high volume data processing are compared and highlighted.

 

Where Do I Go From Here?

Slide deck is available here:

High Volume Data Processing Techniques without Driving Your DBA Crazy

 

In addition to the links that are mentioned in the slide deck, here are some more pointers for your further research.

Batching

Break Large Delete Operations Into Chunks by Aaron Bertrand

ColumnStore

Columnstore Index FAQ

Better Together: Merge, Partitioning, ColumnStore by Greg Low

Inside the ColumnStore Index by Hugo Kornelis

Columnstore Index – Index Statistics by Ted Krueger

Winter 2012 Performance Palooza (special SQLPASS Performance Sessions)

SSIS

SSIS Operational and Tuning Guide

Top 10 SQL Server Integration Services Best Practices by SQL CAT

Balanced Data Distributor For SQL Server 2012

Balanced Data Distributor For SQL Server 2008

Outside the scope

Partitioning is a powerful technique or feature of SQL Server for high volume data processing. It is only available in SQL Server Enterprise edition. Some say it’s tricky, I think it is elegant as a divide and conquer method. Here is a great article by Kendra Little: How To Decide If You Should Use Table Partitioning.

Wrap Up

The recorded session will be made available on Pragmatic Works site by end of the week.

Lastly and most importantly, thank you Pragmatic Works for hosting PASS WIT sessions in March and for having me as a speaker.

Remember: Use the right techniques for the right situations!

 

 

Study in Human Behavior – Self Experiment

This article is a little different from my usual SQL Server topic. I have been interested in data visualization and sociology research. There are two common ways of conducting a study in Sociology or Human Behavior. Due to the increase of (online) social networking, people’s sentiments and opinions can be gathered / harvested for different types of social study. The PASS Business Analytics Conference in April 2013 offers a session on Social Text, Sentiment, and Tone Analysis by Ruben Pertusa Lopez and Paco Gonzales. The session reviews Natural Language Processing, text mining and data mining techniques to observe sentiment and tone analysis. Such study can be used for analyzing brand reputation, market predictions and automatic learning.

Another way of studying social behavior is through the traditional method of observation and/or questioning. This can be conducted towards a set of population or an individual, depending on the purpose. Both observation and questioning methods carry some pros and cons (source: Science Buddies). These are:

Observation
+ Generally most effective means for studying young children who are unable to respond to questions
+ More convenient and less intrusive for subject
+ Captures an individual’s genuine reactions
- Limited to collecting data about visible characteristics or behavior
- More time consuming to capture sufficient data for conclusions

Questioning
+ Valuable for collecting information on unobservable variables such as feelings, motives, perceptions, attitudes, etc…
+ Usually less time consuming method for capturing sufficient data
- The questions, or the mere fact of being questioned, may influence a subject’s responses.

 

Self Experiment Using Direct Questioning Method

Recently I have decided to undertake a human behavior experiment on myself to understand the complexity in the process of the questioning method. This includes defining variables to control, data gathering and generating data visualization that is easy to understand / to consume. So here it goes. (Please note that at a personal level, I learn quite a lot in the process)

Abstract: The human body is designed to respond to changing environmental stresses in a variety of biological and cultural ways. Altering lifestyle, biologically and in technology consumption, can affect productivity, social activities and mood in one week.

Method: Replacing two regular routines with another familiar routine at higher concentration.  The two routines being replaced are public tweeting and chocolate consumption. The replacing routine is various physical exercises at more intense and regular intervals.

Caveats: People are unique; this experiment is highly unlikely to yield the same result on different people due to difference in biological and cultural difference. Larger population may provide better trend and eliminate data noise.

Note: The difficulty in this study is that I don’t live in vacuum. So there are other factors affecting mood, productivity and social interaction. So there are uncontrolled variables that have not been taken into account into the results. Even if I take the same experiment at another time, I may yield a different result. This could be due to past experience or other uncontrolled variables in life. Yes, human behavior research is not an easy process!

Wrap Up

The traditional way of conducting traditional survey or observation on human behavior can be complex. In the self-experiment I have conducted, it seems to be error prone and is limited to a small amount of data analysis. I look forward to learning the large-scale and automated way of gathering and analyzing social interaction at the “Social Text, Sentiment and Tone Analysis” at PASS BA Conference in April. Hope to see you there!

Further Reading

Designing a Study in Sociology or Human Behavior by Science Buddies

Social Text, Sentiment, and Tone Analysis session at PASS BA Conference, April 10 – 12, 2013