TeraTeam UK Teradata DBA Definition
Join types:
In order for related data to be retrieved from multiple tables within a relational database they must be joined using values in their related columns.
Within the Teradata RDBMS parallel architecture the below 'rules' apply:
• For 2 rows, in 2 tables to be joined they must be on the same AMP.
• If 2 tables are being joined on columns which include the Primary Index, and the Primary Indexes have the same data type, they will already occur on the same AMP. If not, then Teradata will move the rows of one or both tables so that potentially matching rows will be on the same AMP, either in Spool or in Memory.
This will be accomplished by either
A) One or both tables will be redistributed. If the Join is based on the Primary Index of one table, then only one table will be redistributed.
or
B) One table will be duplicated across all Amps.
Only those columns and rows that are required by the SQL code, will be involved in the redistribution or duplication.
• Join Processing never moves or changes the original table rows in any way.
• The performance of the Join is largely determined by the number of rows which are selected to be involved in the join from each table.
• The sequence that Tables/Views occur in the SQL code, does not influence the sequence that Tables/Views are joined. This sequence is determined by the Optimiser.
• If a Join involves the columns of the Primary Indexes, and in addition other columns are involved then Teradata will not need to Redistribute the tables because the data will already be suitably distributed.
Join Processing may suffer from skewing where the choice of join columns is poor. Please see Skewing for more information.
Once the data is on the correct Amp, it is sorted into row-hash sequence.
Types of Join:
• Nested Join – Uses Unique indexes to retrieve a single row from one table and join to one or more rows from another table using an index.
• Row ID Join - Uses Memory to store rows from one Table. Either: matches each row from TableB with each row in the Cache. Or: Sorts the rows of Table B by row hash before employing a sorted matching operation, similar to Merge Join.
• Product Join - Preferred by optimiser if one table is small (typically a lookup table), and the join does not use the Primary Index of the larger table. In this case Teradata will Redistribute the larger table, or Duplicate the smaller one. There is no sort but there are many comparisons. Often a product join is the result of poor coding and results in a major performance issue. Contact TeraTeam UK for more on the causes and solutions of problem Product Joins.
• Merge Join - A Merge Join is often chosen where the Join condition is based on equality and therefore it is the most common form of Join.
• Exclusion Merge Join - Normally arises from a ‘NOT IN’ subquery or a ‘Minus’ operation.
• Inclusion Merge Join
• Hash Join – A type of merge join, uses memory or cache for one half of join and joins by hash to an unsorted spool.



