Relational Algebra – Introduction
🧠 What is Relational Algebra?
Relational algebra is the theoretical foundation of relational databases.
It provides a set of operators used to manipulate relations (tables) and produce new relations as results.
Think of it as:
A formal language that allows you to query, filter, and combine data using mathematical operations.
🧾 Why it matters:
- It’s the core logic behind SQL queries.
- It’s used by database engines to optimize and execute queries.
- It helps you understand how queries work under the hood.
🎯 What it works on:
- Input: Relations (tables)
- Output: New relations (tables)
- Every operation always produces a valid table
📦 Example Use Case:
Say you have a table of Students and another of Exams.
Using relational algebra, you can:
- Project only student names
- Filter exams with grade > 24
- Join students with their exams
- Group results by student
- ...all through formal operations.
🔁 Summary:
Relational algebra is the mathematical toolkit that allows us to work with data in a relational database.
Relational Algebra – Operators
🧠 What does this mean?
Relational algebra provides a set of operators that take one or more relations (tables) as input and return another relation as output.
These operators let us:
- Extract data
- Combine data
- Transform data
- Group or summarize data
✅ Key Idea from the Slide:
“Operators to transform relations into relations”
This means:
- Every operator does not modify the original relation
- Instead, it creates a new table (relation)
- These operators can be combined to create complex queries
🧰 Categories of Operators:
We’ll explore them one by one in the coming slides, but here’s a preview of what you’ll learn:

🔁 Why use operators?
Relational algebra is like LEGO:
- Each operator is a block
- You can combine them in many ways to build powerful queries
Projection (π) and Restriction (σ)
🧠 What do they do?
- Projection (π): selects columns (attributes)
- Restriction (σ): selects rows (tuples)
Together, they allow you to extract specific slices of a table.
✅ 1. Projection (π)
Syntax:
πA,B(R)
Meaning:
Return only columns A and B from relation R
📋 Example:
If you have a Students table:

Then:
πName,Age(Students)
Result:

🧠 Projection removes duplicates automatically (like SQL's SELECT DISTINCT).
✅ 2. Restriction (σ)
Syntax:
σCondition(R)
Meaning:
Return only rows that satisfy a condition
📋 Example:
σAge > 21(Students)
Result:

So restriction is like a WHERE clause in SQL.
🔁 Combine Both:
You can use them together:
πName(σAge > 21(Students))
→ Get names of students older than 21
Result:
| Name |
|------|
| Sara |
🎯 Key Takeaway:
Use projection to choose columns, and restriction to choose rows — these are your first tools to filter and shape your data.
Union (∪) and Difference (−)
🧠 What are these?
These are set-theoretic operators that work on two relations with:
- The same number of columns
- The same attribute types in corresponding positions
✅ 1. Union (∪)
Syntax:
R ∪ S
📋 Meaning:
Combine all rows from both relations R and S
Duplicates are removed (just like SQL's UNION)
🧾 Example:
Let’s say you have:

Then:
R ∪ S
→ Result:

✅ 2. Difference (−)
Syntax:
R − S
📋 Meaning:
Return rows that are in R but not in S
🧾 Example:
Using the same R and S as above:
R − S
→ Result:
Name
Ali
Because only Ali is in R and not in S
⚠️ Rules:
- Both relations must have the same schema (same number and type of columns)
- Operate like mathematical sets (no duplicates in results)
🎯 Key Takeaway:
Use Union to merge results, and Difference to filter out values from one set using another.
Cartesian Product (×)
🧠 What is it?
The Cartesian product (also called cross product) takes two relations and returns a new relation that contains all possible combinations of rows — one from each table.
✅ Syntax:
R × S
📋 Slide Example:
Let’s say:
Table R (A):


Every row in R is paired with every row in S.
If R has m rows and S has n rows, the result will have m × n rows.
🧠 Why is this useful?
While it may look strange on its own, Cartesian product is the foundation for joins.
A join is just a Cartesian product followed by a filter (a restriction).
Example:
R × S → σ(R.id = S.id)
⚠️ When is it dangerous?
If you accidentally do a Cartesian product without a condition, you may get a huge table with meaningless combinations.
🎯 Key Takeaway:
The Cartesian product gives you all combinations of two tables — it’s powerful, but use it carefully, usually as part of a join.
Other Operators in Relational Algebra
This slide introduces 3 key operations:
✅ 1. Renaming (δ or ρ)
Sometimes we want to rename a relation or one of its attributes.
Syntax:
δA→B(R)
📋 Meaning:
Rename attribute A to B in relation R
🧾 Example:
Let’s say you have:

Then:
δA→X(R)
→ becomes:
X
1
2
This is useful especially when joining two relations that have columns with the same names, to avoid confusion.
✅ 2. Intersection (∩)
This operator finds rows that exist in both relations.
Syntax:
R ∩ S
📋 Meaning:
Only keep the rows that are common to both R and S
🧾 Example:
If:

Then:
R ∩ S
→ Result:
Name Sara ✅ 3. Join (θ-Join and Natural Join)
Joins are the heart of relational queries. They combine two relations based on a matching condition.
3.1. Theta Join (θ-join)
Syntax:
R ⨝ R.A = S.B S
📋 Meaning:
Combine R and S, but only include rows where R.A = S.B
3.2. Natural Join (⨝)
Syntax:
R ⨝ S
📋 Meaning:
Automatically joins R and S on all attributes with the same name.
It does two things:
- Matches rows based on common attribute values
- Removes duplicate columns in the result
🧾 Natural Join Example:

Then:
R ⨝ S
→ Result:

🎯 Key Takeaway:
These operators give you more power to combine, reshape, and clean your data — especially for multi-table queries and transformations.
Group By & Aggregation
🧠 What is Group By?
Group By allows you to:
- Group rows that share the same values in some attributes
- Apply aggregation functions like
count, min, max, sum, avg to each group
In SQL, this is like:
SELECT Student, COUNT(*), MIN(Grade)
FROM Exams
GROUP BY Student;
📌 Slide 1.7 Example:
Original relation:


Each row summarizes one student’s data across all their exams.
✅ Slide 1.8 – Group By Syntax:
{Ai}γ{fi}(R)
Where:
{Ai} = grouping attributes{fi} = aggregation functions like:count(*)min(A)max(B)sum(C)avg(D)
✅ Slide 1.9 – How it works:
Step-by-step process:
- Sort the relation on the grouping attributes
{Ai} - For each group, compute the aggregation functions
{fi} - Return one row per group, with the
{Ai} values and computed results
✅ Slide 1.10 – Group By as Projection:
Group By is a special kind of projection that:
- Projects only the group keys and aggregated values
- Returns one row per group
Example:
{Student}γ{Count(*), Min(Grade)}
Produces
🎯 Key
Takeaway:
Group By helps you summarize data by category — useful for reports, stats, and insights.
Algebraic Equivalence
🧠 What does this mean?
Algebraic equivalence is the idea that different relational algebra expressions can produce the same result, even if the order of operations is different.
Why this matters:
It gives the query optimizer flexibility to rearrange expressions for better performance — without changing the output.
✅ Two Main Goals of Using Equivalences:
- Change Join Order
- Joins are expensive. Changing the order can dramatically speed up queries.
- Example:
- Join smaller tables first → reduce size early.
- Push Selection Down
- Apply filters (σ) as early as possible, before joins or projections.
- This reduces the size of intermediate results.
🔁 Simple Examples:
Let’s say we have a big table and we want only rows where Grade > 25.
Instead of:
πName(σGrade > 25(Exams ⨝ Students))
The system might optimize it to:
πName((σGrade > 25 Exams) ⨝ Students)
→ It applies the filter earlier so the join has fewer rows to deal with.
🧠 This is a lot like how SQL works:
When you write a complex SQL query, the database engine uses these algebraic rules to find the fastest way to get the answer — even if it rewrites your query internally.
🎯 Key Takeaway:
Relational algebra allows multiple equivalent expressions, and the optimizer chooses the one that runs most efficiently.
Equivalences Over Syntax Trees
🧠 What’s a Syntax Tree in Relational Algebra?
A syntax tree visually represents how relational algebra operations are nested — from the deepest (inner) to the top-level operation.
Each node in the tree is:
- An operation (like selection
σ, projection π, join ⨝) - With input relations or sub-operations as child nodes
🔄 Why this matters?
We use the tree to:
- Visualize how a query is executed
- Understand where selections and projections are applied
- Identify where we can optimize the expression
✅ Optimizations Using the Tree:
🔹 1. Push Selection Down the Tree
Before:
πName(σGrade > 25(Exams ⨝ Students))
Tree:
πName
|
σGrade>25
|
⨝ (Join)
/ \
Exams Students
After optimization:
πName
|
⨝
/ \
σGrade>25 Students
|
Exams
- Here, we filter Exams first, reducing the size before joining with Students.
🔹 2. Reorder Joins or Apply Projections Earlier
The tree can be rewritten in many ways that still give the same result — this is where algebraic equivalence becomes visual and practical.
🎯 Why Syntax Trees Matter:
- They let database systems plan the best path to compute a query.
- Helps you understand query plans and write better queries yourself.
- Useful in database theory, query optimization, and exam questions.
Non Set-Theoretic Operators
Until now, all relational algebra operations followed set theory rules, meaning:
- No duplicates in results
- No specific order of rows
But real-world systems (like SQL) often need extra features like:
- Keeping duplicates
- Sorting rows
So this topic introduces extensions to the standard relational algebra to support those.
✅ 1. Multiset Projection (πᵇ)
Syntax:
πᵇA1,A2(R)
This is a bag (multiset) version of projection.
🔁 Difference from normal π:
- Does not remove duplicates
📋 Example:

- Standard projection would return:
Ali, Sara - Multiset projection would return:
Ali, Sara, Ali
This mimics how SQL's SELECT works without DISTINCT.
✅ 2. Sorting (τ)
Syntax:
τA1,A2(R)
Sorts relation R by attributes A1, A2, etc.
📋 Example:
If you want to sort students by grade:
τGrade(Exams)
This operation adds order to the result — useful for displaying top results, rankings, etc.
🧠 Why are these called “non-set-theoretic”?
Because:
- Sets don’t allow duplicates →
πᵇ does - Sets have no order →
τ adds order
These operators move us closer to real-world behavior (like in SQL), but they go beyond the original pure relational model.
🎯 Final Takeaway:
These extra operators help relational algebra support more practical tasks, like sorting and keeping duplicates — just like SQL does in real databases.
Awesome! Here's a quick summary of all topics from the Relational Algebra PDF — a perfect review for studying or refreshing your memory 🧠📘
✅ Relational Algebra – Summary (Topics 1.1 to 1.13)
1. Introduction (1.1)
Relational algebra is a formal language that defines how to query and transform data in relational databases. It operates on relations (tables) and produces new relations.
2. Operators Overview (1.2)
Relational algebra provides operators to:
- Select rows (restriction)
- Select columns (projection)
- Join, combine, sort, group, and more
3. Projection & Restriction (1.3)
- Projection (π) selects specific columns
- Restriction (σ) selects specific rows
- Together, they are the basic tools to filter data
4. Union & Difference (1.4)
- Union (∪) combines rows from two tables (no duplicates)
- Difference (−) returns rows in one table but not the other
- Both require same schema
5. Cartesian Product (1.5)
- Combines every row of one table with every row of another
- Foundation for joins, but can produce huge outputs
6. Other Operators (1.6)
- Rename (δ) renames attributes or relations
- Intersection (∩) keeps rows common to both tables
- Join (⨝) connects rows based on matching values
- Natural Join uses columns with same names
- Theta Join uses custom conditions
7–10. Group By & Aggregation (1.7 to 1.10)
- Group by ({Ai}γ{fi}) groups rows and applies aggregation like:
count, min, max, sum, avg- Works like SQL's
GROUP BY - Projects one row per group with calculated values
11. Algebraic Equivalence (1.11)
- Multiple relational expressions can give the same result
- Optimizers rearrange expressions for better performance
- Example: push
σ down before join
12. Syntax Tree Equivalence (1.12)
- Queries can be visualized as trees
- Helps optimize query execution by reordering operators
13. Non Set-Theoretic Operators (1.13)
- πᵇ (Multiset Projection) keeps duplicates (like SQL without DISTINCT)
- τ (Sorting) orders rows by specified attributes
- These are not part of classic relational algebra but mimic real-world behavior