(datafusion-site) branch asf-staging updated: Commit build products

github-bot Tue, 12 Aug 2025 07:20:01 -0700

This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/datafusion-site.git



The following commit(s) were added to refs/heads/asf-staging by this push:
     new 4cf2fe2  Commit build products
4cf2fe2 is described below

commit 4cf2fe218be26373b4310d6db5c1ee4f55ecd29d
Author: Build Pelican (action) <priv...@infra.apache.org>
AuthorDate: Tue Aug 12 14:19:52 2025 +0000

    Commit build products
---
 blog/2025/08/15/external-parquet-indexes/index.html | 5 ++++-
 blog/feeds/all-en.atom.xml                          | 5 ++++-
 blog/feeds/andrew-lamb-influxdata.atom.xml          | 5 ++++-
 blog/feeds/blog.atom.xml                            | 5 ++++-
 4 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/blog/2025/08/15/external-parquet-indexes/index.html 
b/blog/2025/08/15/external-parquet-indexes/index.html
index 285c245..6cd5bbe 100644
--- a/blog/2025/08/15/external-parquet-indexes/index.html
+++ b/blog/2025/08/15/external-parquet-indexes/index.html
@@ -587,6 +587,9 @@ components, rather than as a single tightly integrated 
system.</p>
 improve the project. If you are interested in learning more about how query
 execution works, help document or improve the DataFusion codebase, or just try
 it out, we would love for you to join us.</p>
+<h3>Acknowledgements</h3>
+<p>Thank you to <a href="https://github.com/zhuqi-lucas">Qi Zhu</a>, <a 
href="https://github.com/adamreeve">Adam Reeve</a>, <a 
href="https://github.com/JigaoLuo">Jigao Luo</a>, <a 
href="https://github.com/comphead">Oleks V</a>, <a 
href="https://github.com/shehabgamin">Shehab Amin</a>, <a 
href="https://nuno-faria.github.io/">Nuno Faria</a>
+and <a href="https://github.com/Omega359">Bruce Ritchie</a> for their 
insightful feedback on this blog post.</p>
 <h3>Footnotes</h3>
 <p><a id="footnote1"></a><code>1</code>: This trend is described in more 
detail in the <a 
href="https://www.influxdata.com/blog/flight-datafusion-arrow-parquet-fdap-architecture-influxdb/">FDAP
 Stack</a> blog</p>
 <p><a id="footnote2"></a><code>2</code>: This layout is referred to as <a 
href="https://www.vldb.org/conf/2001/P169.pdf">PAX in the
@@ -601,7 +604,7 @@ with additional engineering effort (see <a 
href="https://xiangpeng.systems/">Xia
 topic</a>). <a href="https://github.com/etseidl";>Ed Seidl</a> is beginning 
this effort. See the <a 
href="https://github.com/apache/arrow-rs/issues/5854">ticket</a> for 
details.</p>
 <p><a id="footnote6"></a><code>6</code>: ClickBench includes a wide variety of 
query patterns
 such as point lookups, filters of different selectivity, and aggregations.</p>
-<p><a id="footnote7"></a><code>7</code>: For example, <a 
href="https://github.com/zhuqi-lucas">Zhu Qi</a> was able to speed up reads by 
over 2x 
+<p><a id="footnote7"></a><code>7</code>: For example, <a 
href="https://github.com/zhuqi-lucas">Qi Zhu</a> was able to speed up reads by 
over 2x 
 simply by rewriting the Parquet files with Offset Indexes and no compression 
(see <a 
href="https://github.com/apache/datafusion/issues/16149#issuecomment-2918761743">issue
 #16149 comment</a> for details).
 There is likely significant additional performance available by using Bloom 
Filters and resorting the data
 to be clustered in a more optimal way for the queries.</p>
diff --git a/blog/feeds/all-en.atom.xml b/blog/feeds/all-en.atom.xml
index 1261772..8f643f6 100644
--- a/blog/feeds/all-en.atom.xml
+++ b/blog/feeds/all-en.atom.xml
@@ -567,6 +567,9 @@ components, rather than as a single tightly integrated 
system.&lt;/p&gt;
 improve the project. If you are interested in learning more about how query
 execution works, help document or improve the DataFusion codebase, or just try
 it out, we would love for you to join us.&lt;/p&gt;
+&lt;h3&gt;Acknowledgements&lt;/h3&gt;
+&lt;p&gt;Thank you to &lt;a href="https://github.com/zhuqi-lucas"&gt;Qi 
Zhu&lt;/a&gt;, &lt;a href="https://github.com/adamreeve"&gt;Adam 
Reeve&lt;/a&gt;, &lt;a href="https://github.com/JigaoLuo"&gt;Jigao 
Luo&lt;/a&gt;, &lt;a href="https://github.com/comphead"&gt;Oleks V&lt;/a&gt;, 
&lt;a href="https://github.com/shehabgamin"&gt;Shehab Amin&lt;/a&gt;, &lt;a 
href="https://nuno-faria.github.io/"&gt;Nuno Faria&lt;/a&gt;
+and &lt;a href="https://github.com/Omega359"&gt;Bruce Ritchie&lt;/a&gt; for 
their insightful feedback on this blog post.&lt;/p&gt;
 &lt;h3&gt;Footnotes&lt;/h3&gt;
 &lt;p&gt;&lt;a id="footnote1"&gt;&lt;/a&gt;&lt;code&gt;1&lt;/code&gt;: This 
trend is described in more detail in the &lt;a 
href="https://www.influxdata.com/blog/flight-datafusion-arrow-parquet-fdap-architecture-influxdb/"&gt;FDAP
 Stack&lt;/a&gt; blog&lt;/p&gt;
 &lt;p&gt;&lt;a id="footnote2"&gt;&lt;/a&gt;&lt;code&gt;2&lt;/code&gt;: This 
layout is referred to as &lt;a 
href="https://www.vldb.org/conf/2001/P169.pdf"&gt;PAX in the
@@ -581,7 +584,7 @@ with additional engineering effort (see &lt;a 
href="https://xiangpeng.systems/"&;
 topic&lt;/a&gt;). &lt;a href="https://github.com/etseidl"&gt;Ed 
Seidl&lt;/a&gt; is beginning this effort. See the &lt;a 
href="https://github.com/apache/arrow-rs/issues/5854"&gt;ticket&lt;/a&gt; for 
details.&lt;/p&gt;
 &lt;p&gt;&lt;a id="footnote6"&gt;&lt;/a&gt;&lt;code&gt;6&lt;/code&gt;: 
ClickBench includes a wide variety of query patterns
 such as point lookups, filters of different selectivity, and 
aggregations.&lt;/p&gt;
-&lt;p&gt;&lt;a id="footnote7"&gt;&lt;/a&gt;&lt;code&gt;7&lt;/code&gt;: For 
example, &lt;a href="https://github.com/zhuqi-lucas"&gt;Zhu Qi&lt;/a&gt; was 
able to speed up reads by over 2x 
+&lt;p&gt;&lt;a id="footnote7"&gt;&lt;/a&gt;&lt;code&gt;7&lt;/code&gt;: For 
example, &lt;a href="https://github.com/zhuqi-lucas"&gt;Qi Zhu&lt;/a&gt; was 
able to speed up reads by over 2x 
 simply by rewriting the Parquet files with Offset Indexes and no compression 
(see &lt;a 
href="https://github.com/apache/datafusion/issues/16149#issuecomment-2918761743"&gt;issue
 #16149 comment&lt;/a&gt; for details).
 There is likely significant additional performance available by using Bloom 
Filters and resorting the data
 to be clustered in a more optimal way for the 
queries.&lt;/p&gt;</content><category 
term="blog"></category></entry><entry><title>Apache DataFusion 49.0.0 
Released</title><link 
href="https://datafusion.apache.org/blog/2025/07/28/datafusion-49.0.0" 
rel="alternate"></link><published>2025-07-28T00:00:00+00:00</published><updated>2025-07-28T00:00:00+00:00</updated><author><name>pmc</name></author><id>tag:datafusion.apache.org,2025-07-28:/blog/2025/07/28/datafusion-49.0.0</id><summary
 type="ht [...]
diff --git a/blog/feeds/andrew-lamb-influxdata.atom.xml 
b/blog/feeds/andrew-lamb-influxdata.atom.xml
index 77d036c..b1b2365 100644
--- a/blog/feeds/andrew-lamb-influxdata.atom.xml
+++ b/blog/feeds/andrew-lamb-influxdata.atom.xml
@@ -567,6 +567,9 @@ components, rather than as a single tightly integrated 
system.&lt;/p&gt;
 improve the project. If you are interested in learning more about how query
 execution works, help document or improve the DataFusion codebase, or just try
 it out, we would love for you to join us.&lt;/p&gt;
+&lt;h3&gt;Acknowledgements&lt;/h3&gt;
+&lt;p&gt;Thank you to &lt;a href="https://github.com/zhuqi-lucas"&gt;Qi 
Zhu&lt;/a&gt;, &lt;a href="https://github.com/adamreeve"&gt;Adam 
Reeve&lt;/a&gt;, &lt;a href="https://github.com/JigaoLuo"&gt;Jigao 
Luo&lt;/a&gt;, &lt;a href="https://github.com/comphead"&gt;Oleks V&lt;/a&gt;, 
&lt;a href="https://github.com/shehabgamin"&gt;Shehab Amin&lt;/a&gt;, &lt;a 
href="https://nuno-faria.github.io/"&gt;Nuno Faria&lt;/a&gt;
+and &lt;a href="https://github.com/Omega359"&gt;Bruce Ritchie&lt;/a&gt; for 
their insightful feedback on this blog post.&lt;/p&gt;
 &lt;h3&gt;Footnotes&lt;/h3&gt;
 &lt;p&gt;&lt;a id="footnote1"&gt;&lt;/a&gt;&lt;code&gt;1&lt;/code&gt;: This 
trend is described in more detail in the &lt;a 
href="https://www.influxdata.com/blog/flight-datafusion-arrow-parquet-fdap-architecture-influxdb/"&gt;FDAP
 Stack&lt;/a&gt; blog&lt;/p&gt;
 &lt;p&gt;&lt;a id="footnote2"&gt;&lt;/a&gt;&lt;code&gt;2&lt;/code&gt;: This 
layout is referred to as &lt;a 
href="https://www.vldb.org/conf/2001/P169.pdf"&gt;PAX in the
@@ -581,7 +584,7 @@ with additional engineering effort (see &lt;a 
href="https://xiangpeng.systems/"&;
 topic&lt;/a&gt;). &lt;a href="https://github.com/etseidl"&gt;Ed 
Seidl&lt;/a&gt; is beginning this effort. See the &lt;a 
href="https://github.com/apache/arrow-rs/issues/5854"&gt;ticket&lt;/a&gt; for 
details.&lt;/p&gt;
 &lt;p&gt;&lt;a id="footnote6"&gt;&lt;/a&gt;&lt;code&gt;6&lt;/code&gt;: 
ClickBench includes a wide variety of query patterns
 such as point lookups, filters of different selectivity, and 
aggregations.&lt;/p&gt;
-&lt;p&gt;&lt;a id="footnote7"&gt;&lt;/a&gt;&lt;code&gt;7&lt;/code&gt;: For 
example, &lt;a href="https://github.com/zhuqi-lucas"&gt;Zhu Qi&lt;/a&gt; was 
able to speed up reads by over 2x 
+&lt;p&gt;&lt;a id="footnote7"&gt;&lt;/a&gt;&lt;code&gt;7&lt;/code&gt;: For 
example, &lt;a href="https://github.com/zhuqi-lucas"&gt;Qi Zhu&lt;/a&gt; was 
able to speed up reads by over 2x 
 simply by rewriting the Parquet files with Offset Indexes and no compression 
(see &lt;a 
href="https://github.com/apache/datafusion/issues/16149#issuecomment-2918761743"&gt;issue
 #16149 comment&lt;/a&gt; for details).
 There is likely significant additional performance available by using Bloom 
Filters and resorting the data
 to be clustered in a more optimal way for the 
queries.&lt;/p&gt;</content><category term="blog"></category></entry></feed>
\ No newline at end of file
diff --git a/blog/feeds/blog.atom.xml b/blog/feeds/blog.atom.xml
index 69d53ea..86d5433 100644
--- a/blog/feeds/blog.atom.xml
+++ b/blog/feeds/blog.atom.xml
@@ -567,6 +567,9 @@ components, rather than as a single tightly integrated 
system.&lt;/p&gt;
 improve the project. If you are interested in learning more about how query
 execution works, help document or improve the DataFusion codebase, or just try
 it out, we would love for you to join us.&lt;/p&gt;
+&lt;h3&gt;Acknowledgements&lt;/h3&gt;
+&lt;p&gt;Thank you to &lt;a href="https://github.com/zhuqi-lucas"&gt;Qi 
Zhu&lt;/a&gt;, &lt;a href="https://github.com/adamreeve"&gt;Adam 
Reeve&lt;/a&gt;, &lt;a href="https://github.com/JigaoLuo"&gt;Jigao 
Luo&lt;/a&gt;, &lt;a href="https://github.com/comphead"&gt;Oleks V&lt;/a&gt;, 
&lt;a href="https://github.com/shehabgamin"&gt;Shehab Amin&lt;/a&gt;, &lt;a 
href="https://nuno-faria.github.io/"&gt;Nuno Faria&lt;/a&gt;
+and &lt;a href="https://github.com/Omega359"&gt;Bruce Ritchie&lt;/a&gt; for 
their insightful feedback on this blog post.&lt;/p&gt;
 &lt;h3&gt;Footnotes&lt;/h3&gt;
 &lt;p&gt;&lt;a id="footnote1"&gt;&lt;/a&gt;&lt;code&gt;1&lt;/code&gt;: This 
trend is described in more detail in the &lt;a 
href="https://www.influxdata.com/blog/flight-datafusion-arrow-parquet-fdap-architecture-influxdb/"&gt;FDAP
 Stack&lt;/a&gt; blog&lt;/p&gt;
 &lt;p&gt;&lt;a id="footnote2"&gt;&lt;/a&gt;&lt;code&gt;2&lt;/code&gt;: This 
layout is referred to as &lt;a 
href="https://www.vldb.org/conf/2001/P169.pdf"&gt;PAX in the
@@ -581,7 +584,7 @@ with additional engineering effort (see &lt;a 
href="https://xiangpeng.systems/"&;
 topic&lt;/a&gt;). &lt;a href="https://github.com/etseidl"&gt;Ed 
Seidl&lt;/a&gt; is beginning this effort. See the &lt;a 
href="https://github.com/apache/arrow-rs/issues/5854"&gt;ticket&lt;/a&gt; for 
details.&lt;/p&gt;
 &lt;p&gt;&lt;a id="footnote6"&gt;&lt;/a&gt;&lt;code&gt;6&lt;/code&gt;: 
ClickBench includes a wide variety of query patterns
 such as point lookups, filters of different selectivity, and 
aggregations.&lt;/p&gt;
-&lt;p&gt;&lt;a id="footnote7"&gt;&lt;/a&gt;&lt;code&gt;7&lt;/code&gt;: For 
example, &lt;a href="https://github.com/zhuqi-lucas"&gt;Zhu Qi&lt;/a&gt; was 
able to speed up reads by over 2x 
+&lt;p&gt;&lt;a id="footnote7"&gt;&lt;/a&gt;&lt;code&gt;7&lt;/code&gt;: For 
example, &lt;a href="https://github.com/zhuqi-lucas"&gt;Qi Zhu&lt;/a&gt; was 
able to speed up reads by over 2x 
 simply by rewriting the Parquet files with Offset Indexes and no compression 
(see &lt;a 
href="https://github.com/apache/datafusion/issues/16149#issuecomment-2918761743"&gt;issue
 #16149 comment&lt;/a&gt; for details).
 There is likely significant additional performance available by using Bloom 
Filters and resorting the data
 to be clustered in a more optimal way for the 
queries.&lt;/p&gt;</content><category 
term="blog"></category></entry><entry><title>Apache DataFusion 49.0.0 
Released</title><link 
href="https://datafusion.apache.org/blog/2025/07/28/datafusion-49.0.0" 
rel="alternate"></link><published>2025-07-28T00:00:00+00:00</published><updated>2025-07-28T00:00:00+00:00</updated><author><name>pmc</name></author><id>tag:datafusion.apache.org,2025-07-28:/blog/2025/07/28/datafusion-49.0.0</id><summary
 type="ht [...]


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@datafusion.apache.org
For additional commands, e-mail: commits-h...@datafusion.apache.org
