Schedule - PostgreSQL Development Conference 2024

Performance Improvements of Partitioning: Past and Future

Date: 2024-05-30
Time: 14:30–15:20
Room: Fletcher(1900)
Level: Advanced

Declarative partitioning, initially introduced in PostgreSQL 10, is one of the most essential features of PostgreSQL, and user demand for partitioning is increasing year by year. Many developers have worked on improving its performance, especially in handling highly partitioned tables, and partition pruning was sped up in PostgreSQL 11. However, users often have to deal with workloads where pruning is not available, i.e., where most child partitions are involved in the queries. Planning for such queries remains challenging even in 2024. I found that the planning time increases rapidly in quadratic time complexity (O(n^2)) with respect to the number of child partitions (n). This is because the long-standing code in the planner does not address the equivalences of join operations in such highly partitioned cases.

To solve this problem, I have proposed a patch to improve the planning performance in pgsql-hackers [1]. The patch reduces the complexity to linear (O(n)) and demonstrates a 10x or more speedup for 1000 partitions in the Join Order Benchmark. The planning time can even go from minutes to seconds. This improvement is tremendously valuable in this era of big data, where users must deal with hundreds of partitions.

In this talk, I will first look back at how PostgreSQL has improved its partitioning performance so far. Then, I will detail the above problem with planning highly partitioned tables and explain how I solved it. This part will include a deep dive into the planner, especially the planning of join operations, and will clarify where the bottleneck is in the current PostgreSQL implementation. This knowledge will be helpful for PostgreSQL developers to avoid slowing down partitioning. Lastly, I will look ahead to the future of PostgreSQL's partitioning features.



Yuya Watari