Avoid using unnecessary joins on data tables by using data table relationships in TIBCO Spotfire. Joins can be slow and clutter your data. Here is one way to avoid this.

The Power of Data Relationships and Avoiding Database Joins

Setting up your data in Spotfire is one of the most important tasks. Often the temptation to use traditional methods from databases such as Oracle and SQL Server is great. By this I mean using joins to make the tables into the shape you want, or often doing the join within Spotfire so you have all the columns in one table.

From my experience with Spotfire the best analyses and applications are developed when we stop thinking about how things used to be done, and start thinking about how Spotfire would do it! There are so many tricks with data you can do if you set out your tables correctly and use the right calculated columns. One very important and powerful tool for optimising your data table usage is Data Table Relations.

To set up a data table relationship follow the steps below:

  • Go to the Edit Menu
  • Select Data Table Properties
  • Click on the Relations tab

Manage Relations
Manage Relations Example - Notice that the colour beside each table is different. When the colours match this means tables are related

  • Click Manage Relations
  • Click New
  • Set your tables to use, which columns to be matched and if required the method for each column. The method is very useful if the columns you match are not of the same type. You can force these types to match using the Method option.

Set Data Relationships
Setting up Data Relationships

  • Once you 'Ok' all of this you have a data relationship and the colours shown in your data table properties GUI should match:

Related Data Tables
Two tables are now related

Data Table Relations are key for instance when using the map chart as you often require a related table to colour your map items by. Relating data tables is also important if you want your markings to show as marked in visualisations and tables for related data. Below is an example where I marked some rows in the Reference Table and due to the relationship set up, the related rows in the Raw Data table are also now marked:

Marking From Data Relationships
Example of how markings can be work in multiple data tables using relationships

However an often overlooked functionality is for the ability to filter related data tables by each other. This is extremely useful for instance if you have a table you wish to use or provide users with that has the columns they wish to filter by. When the filters are used however you want all other related tables to be filtered also. You can also use this method to provide a cascading effect whereby filtering one table filters another table, which in turn is related to another table which then also reacts to the filtering. Continuing from my example above I am going to use the relationships set up to make the 2nd table filter when the 1st table is filtered:

  • Open the Filters GUI if it is not already open
  • Notice that there is a small icon (looks like two tables on top of each other) for any data table with a relationship
  • Click on this icon and you are presented with a list of tables related to this table
  • Select the table and you can then choose from 3 options:

  • Include Filtered Rows Only - this is the most common option to use. This table will be filtered to match only the related rows in the related table i.e. if you filter the related table, this table is filtered to the same resulting rows.

  • Exclude Filtered Out Rows Only - this can be a little confusing but essentially if there are rows which are filtered out in your reference table, the related rows will be excluded from this table also. This option is useful but can have some interesting effects if your table does not have a related row in your reference table as it will never be filtered out.
  • Ignore Filtering - this is the default and means the tables do not react to each other.

Below is the screenshot showing these options:

Setting Your Relationship Filtering Rules
Setting Your Relationship Filtering Rules

How does this avoid table joins as I stated at the start? The reason you can avoid table joins is traditionally you would merge in the columns you want to filter by onto your actual data. However by using your relationships and filter correctly you can avoid this. This results in very clean data tables in your Spotfire analysis which haven't been joined multiple times and are very close or identical to how the source data was original stored. This has benefits in the future when trying to maintain, evolve the analysis or even replace data tables. The 'cleaner' your data in Spotfire, the easier it is to make your tool dynamic but also react to changing data, and the structure of said data upon which your analysis and tool rely.

Here is my example in action where I have set the Raw Data table to include only filtered rows in the Reference Data table. I made a nice filter listbox from the NAME column from the reference data table in a text area. I select one name from this and the raw data is automatically filtered.

Example of Data Relationships In Action
Example of Data Relationships In Action

This functionality is one of those items that does not stick out as being particularly amazing until you really start to use it in anger. Once you have multiple data tables which are all related in some way the power of relationships becomes great. Not only does it improve your ability to make data react to each other but when done right it also makes your analyses much cleaner and easier to maintain avoiding unnecessary database joins. It also means you are less vulnerable to changes in underlying data structures.

JoinsrelationshipsdatabaseSpotfirefiltering
Posted by: Colin Gray
Last revised: 29 Nov, 2012 10:44 PM History

Comments

blog comments powered by Disqus