(datafusion-site) branch asf-site updated: Commit build products

github-bot Tue, 15 Jul 2025 03:48:39 -0700

This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/datafusion-site.git



The following commit(s) were added to refs/heads/asf-site by this push:
     new 92e711d  Commit build products
92e711d is described below

commit 92e711d6f862ebaa09acd9afa6ac20205d861089
Author: Build Pelican (action) <[email protected]>
AuthorDate: Tue Jul 15 10:48:28 2025 +0000

    Commit build products
---
 output/2025/07/14/user-defined-parquet-indexes/index.html     | 11 ++++++++++-
 ...ems-group-at-tu-darmstadt-and-andrew-lamb-influxdata.html} |  6 +++---
 output/category/blog.html                                     |  2 +-
 output/feed.xml                                               |  2 +-
 output/feeds/all-en.atom.xml                                  | 11 ++++++++++-
 output/feeds/blog.atom.xml                                    | 11 ++++++++++-
 ...group-at-tu-darmstadt-and-andrew-lamb-influxdata.atom.xml} | 11 ++++++++++-
 ...-group-at-tu-darmstadt-and-andrew-lamb-influxdata.rss.xml} |  4 ++--
 output/index.html                                             |  2 +-
 9 files changed, 48 insertions(+), 12 deletions(-)

diff --git a/output/2025/07/14/user-defined-parquet-indexes/index.html 
b/output/2025/07/14/user-defined-parquet-indexes/index.html
index 0684735..9b74b62 100644
--- a/output/2025/07/14/user-defined-parquet-indexes/index.html
+++ b/output/2025/07/14/user-defined-parquet-indexes/index.html
@@ -42,7 +42,7 @@
           <h1>
             Embedding User-Defined Indexes in Apache Parquet Files
           </h1>
-            <p>Posted on: Mon 14 July 2025 by Qi Zhu, Jigao Luo, and Andrew 
Lamb</p>
+            <p>Posted on: Mon 14 July 2025 by Qi Zhu (Cloudera), Jigao Luo 
(Systems Group at TU Darmstadt), and Andrew Lamb (InfluxData)</p>
             <!--
 {% comment %}
 Licensed to the Apache Software Foundation (ASF) under one or more
@@ -488,6 +488,15 @@ df.show().await?;
 <p>In this post, we explained how index structures are stored in Apache 
Parquet, how to embed user-defined indexes without changing the format, and how 
to use user-defined indexes to speed up query processing.</p>
 <p>Parquet-based systems can achieve significant performance improvements for 
almost any query pattern while still retaining broad compatibility, using 
user-defined embedded indexes, external indexes<sup><a 
href="#footnote4">4</a></sup> and rewriting files optimized for specific 
queries<sup><a href="#footnote5">5</a></sup>. System designers can choose among 
the available options to make the appropriate trade-offs between operational 
complexity, performance, file size, and cost for their  [...]
 <p>We hope this post inspires you to explore custom indexes in Parquet files, 
rather than proposing new file formats and reimplementing existing features. 
The DataFusion community is excited to see how you use this feature in your 
projects!</p>
+<h2>About the Authors</h2>
+<p><a href="https://www.linkedin.com/in/qi-zhu-862330119/">Qi Zhu</a> is a 
Senior Engineer at <a href="https://www.cloudera.com/">Cloudera</a>, an active 
contributor to <a href="https://datafusion.apache.org/">Apache DataFusion</a> 
and <a href="https://arrow.apache.org/">Apache Arrow</a>, a committer on <a 
href="https://hadoop.apache.org/">Apache Hadoop</a> and <a 
href="https://yunikorn.apache.org/">Apache YuniKorn</a>. He has extensive 
experience in distributed systems, scheduling, and  [...]
+<p><a href="https://www.linkedin.com/in/jigao-luo/">Jigao Luo</a> is a 
1.5-year PhD student at
+<a href="https://tuda.systems";>Systems Group @ TU Darmstadt</a>. Regarding 
Parquet, he is an external 
+contributor to <a href="https://github.com/rapidsai/cudf";>NVIDIA RAPIDS 
cuDF</a>, focusing on the GPU Parquet reader.</p>
+<p><a href="https://www.linkedin.com/in/andrewalamb/">Andrew Lamb</a> is a 
Staff Engineer at
+<a href="https://www.influxdata.com/">InfluxData</a>, and a member of the <a 
href="https://datafusion.apache.org/">Apache
+DataFusion</a> and <a href="https://arrow.apache.org/">Apache Arrow</a> PMCs. 
He has been working on
+Databases and related systems more than 20 years.</p>
 <h2>About DataFusion</h2>
 <p><a href="https://datafusion.apache.org/">Apache DataFusion</a> is an 
extensible query engine toolkit, written
 in Rust, that uses <a href="https://arrow.apache.org/">Apache Arrow</a> as its 
in-memory format. DataFusion and
diff --git a/output/author/qi-zhu-jigao-luo-and-andrew-lamb.html 
b/output/author/qi-zhu-cloudera-jigao-luo-systems-group-at-tu-darmstadt-and-andrew-lamb-influxdata.html
similarity index 87%
rename from output/author/qi-zhu-jigao-luo-and-andrew-lamb.html
rename to 
output/author/qi-zhu-cloudera-jigao-luo-systems-group-at-tu-darmstadt-and-andrew-lamb-influxdata.html
index 9b4fcf2..1bd4902 100644
--- a/output/author/qi-zhu-jigao-luo-and-andrew-lamb.html
+++ 
b/output/author/qi-zhu-cloudera-jigao-luo-systems-group-at-tu-darmstadt-and-andrew-lamb-influxdata.html
@@ -1,7 +1,7 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
-        <title>Apache DataFusion Blog - Articles by Qi Zhu, Jigao Luo, and 
Andrew Lamb</title>
+        <title>Apache DataFusion Blog - Articles by Qi Zhu (Cloudera), Jigao 
Luo (Systems Group at TU Darmstadt), and Andrew Lamb (InfluxData)</title>
         <meta charset="utf-8" />
         <meta name="generator" content="Pelican" />
         <link href="https://datafusion.apache.org/blog/feed.xml" 
type="application/rss+xml" rel="alternate" title="Apache DataFusion Blog RSS 
Feed" />
@@ -17,7 +17,7 @@
             <li><a 
href="https://datafusion.apache.org/blog/category/blog.html">blog</a></li>
         </ul></nav><!-- /#menu -->
 <section id="content">
-<h2>Articles by Qi Zhu, Jigao Luo, and Andrew Lamb</h2>
+<h2>Articles by Qi Zhu (Cloudera), Jigao Luo (Systems Group at TU Darmstadt), 
and Andrew Lamb (InfluxData)</h2>
 
 <ol id="post-list">
         <li><article class="hentry">
@@ -25,7 +25,7 @@
                 <footer class="post-info">
                     <time class="published" 
datetime="2025-07-14T00:00:00+00:00"> Mon 14 July 2025 </time>
                     <address class="vcard author">By
-                        <a class="url fn" 
href="https://datafusion.apache.org/blog/author/qi-zhu-jigao-luo-and-andrew-lamb.html">Qi
 Zhu, Jigao Luo, and Andrew Lamb</a>
+                        <a class="url fn" 
href="https://datafusion.apache.org/blog/author/qi-zhu-cloudera-jigao-luo-systems-group-at-tu-darmstadt-and-andrew-lamb-influxdata.html">Qi
 Zhu (Cloudera), Jigao Luo (Systems Group at TU Darmstadt), and Andrew Lamb 
(InfluxData)</a>
                     </address>
                 </footer><!-- /.post-info -->
                 <div class="entry-content"> <!--
diff --git a/output/category/blog.html b/output/category/blog.html
index abf197e..959bcb2 100644
--- a/output/category/blog.html
+++ b/output/category/blog.html
@@ -26,7 +26,7 @@
                 <footer class="post-info">
                     <time class="published" 
datetime="2025-07-14T00:00:00+00:00"> Mon 14 July 2025 </time>
                     <address class="vcard author">By
-                        <a class="url fn" 
href="https://datafusion.apache.org/blog/author/qi-zhu-jigao-luo-and-andrew-lamb.html">Qi
 Zhu, Jigao Luo, and Andrew Lamb</a>
+                        <a class="url fn" 
href="https://datafusion.apache.org/blog/author/qi-zhu-cloudera-jigao-luo-systems-group-at-tu-darmstadt-and-andrew-lamb-influxdata.html">Qi
 Zhu (Cloudera), Jigao Luo (Systems Group at TU Darmstadt), and Andrew Lamb 
(InfluxData)</a>
                     </address>
                 </footer><!-- /.post-info -->
                 <div class="entry-content"> <!--
diff --git a/output/feed.xml b/output/feed.xml
index 3da6133..5b2c40d 100644
--- a/output/feed.xml
+++ b/output/feed.xml
@@ -17,7 +17,7 @@ See the License for the specific language governing 
permissions and
 limitations under the License.
 {% endcomment %}
 --&gt;
-&lt;p&gt;It&amp;rsquo;s a common misconception that &lt;a 
href="https://parquet.apache.org/"&gt;Apache Parquet&lt;/a&gt; files are 
limited to basic Min/Max/Null Count statistics and Bloom filters, and that 
adding more advanced indexes requires changing the specification or creating a 
new file format. In fact, footer metadata and offset-based addressing already 
provide everything needed to embed …&lt;/p&gt;</description><dc:creator 
xmlns:dc="http://purl.org/dc/elements/1.1/">Qi Zhu, Jigao [...]
+&lt;p&gt;It&amp;rsquo;s a common misconception that &lt;a 
href="https://parquet.apache.org/"&gt;Apache Parquet&lt;/a&gt; files are 
limited to basic Min/Max/Null Count statistics and Bloom filters, and that 
adding more advanced indexes requires changing the specification or creating a 
new file format. In fact, footer metadata and offset-based addressing already 
provide everything needed to embed …&lt;/p&gt;</description><dc:creator 
xmlns:dc="http://purl.org/dc/elements/1.1/">Qi Zhu (Cloud [...]
 {% comment %}
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements.  See the NOTICE file distributed with
diff --git a/output/feeds/all-en.atom.xml b/output/feeds/all-en.atom.xml
index 8e2768b..7a2beb8 100644
--- a/output/feeds/all-en.atom.xml
+++ b/output/feeds/all-en.atom.xml
@@ -1,5 +1,5 @@
 <?xml version="1.0" encoding="utf-8"?>
-<feed xmlns="http://www.w3.org/2005/Atom"><title>Apache DataFusion 
Blog</title><link href="https://datafusion.apache.org/blog/" 
rel="alternate"></link><link 
href="https://datafusion.apache.org/blog/feeds/all-en.atom.xml" 
rel="self"></link><id>https://datafusion.apache.org/blog/</id><updated>2025-07-14T00:00:00+00:00</updated><subtitle></subtitle><entry><title>Embedding
 User-Defined Indexes in Apache Parquet Files</title><link 
href="https://datafusion.apache.org/blog/2025/07/14/user-defin [...]
+<feed xmlns="http://www.w3.org/2005/Atom"><title>Apache DataFusion 
Blog</title><link href="https://datafusion.apache.org/blog/" 
rel="alternate"></link><link 
href="https://datafusion.apache.org/blog/feeds/all-en.atom.xml" 
rel="self"></link><id>https://datafusion.apache.org/blog/</id><updated>2025-07-14T00:00:00+00:00</updated><subtitle></subtitle><entry><title>Embedding
 User-Defined Indexes in Apache Parquet Files</title><link 
href="https://datafusion.apache.org/blog/2025/07/14/user-defin [...]
 {% comment %}
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements.  See the NOTICE file distributed with
@@ -462,6 +462,15 @@ df.show().await?;
 &lt;p&gt;In this post, we explained how index structures are stored in Apache 
Parquet, how to embed user-defined indexes without changing the format, and how 
to use user-defined indexes to speed up query processing.&lt;/p&gt;
 &lt;p&gt;Parquet-based systems can achieve significant performance 
improvements for almost any query pattern while still retaining broad 
compatibility, using user-defined embedded indexes, external 
indexes&lt;sup&gt;&lt;a href="#footnote4"&gt;4&lt;/a&gt;&lt;/sup&gt; and 
rewriting files optimized for specific queries&lt;sup&gt;&lt;a 
href="#footnote5"&gt;5&lt;/a&gt;&lt;/sup&gt;. System designers can choose among 
the available options to make the appropriate trade-offs between operational c 
[...]
 &lt;p&gt;We hope this post inspires you to explore custom indexes in Parquet 
files, rather than proposing new file formats and reimplementing existing 
features. The DataFusion community is excited to see how you use this feature 
in your projects!&lt;/p&gt;
+&lt;h2&gt;About the Authors&lt;/h2&gt;
+&lt;p&gt;&lt;a href="https://www.linkedin.com/in/qi-zhu-862330119/"&gt;Qi 
Zhu&lt;/a&gt; is a Senior Engineer at &lt;a 
href="https://www.cloudera.com/"&gt;Cloudera&lt;/a&gt;, an active contributor 
to &lt;a href="https://datafusion.apache.org/"&gt;Apache DataFusion&lt;/a&gt; 
and &lt;a href="https://arrow.apache.org/"&gt;Apache Arrow&lt;/a&gt;, a 
committer on &lt;a href="https://hadoop.apache.org/"&gt;Apache Hadoop&lt;/a&gt; 
and &lt;a href="https://yunikorn.apache.org/"&gt;Apache YuniKorn&l [...]
+&lt;p&gt;&lt;a href="https://www.linkedin.com/in/jigao-luo/"&gt;Jigao 
Luo&lt;/a&gt; is a 1.5-year PhD student at
+&lt;a href="https://tuda.systems"&gt;Systems Group @ TU Darmstadt&lt;/a&gt;. 
Regarding Parquet, he is an external 
+contributor to &lt;a href="https://github.com/rapidsai/cudf"&gt;NVIDIA RAPIDS 
cuDF&lt;/a&gt;, focusing on the GPU Parquet reader.&lt;/p&gt;
+&lt;p&gt;&lt;a href="https://www.linkedin.com/in/andrewalamb/"&gt;Andrew 
Lamb&lt;/a&gt; is a Staff Engineer at
+&lt;a href="https://www.influxdata.com/"&gt;InfluxData&lt;/a&gt;, and a member 
of the &lt;a href="https://datafusion.apache.org/"&gt;Apache
+DataFusion&lt;/a&gt; and &lt;a href="https://arrow.apache.org/"&gt;Apache 
Arrow&lt;/a&gt; PMCs. He has been working on
+Databases and related systems more than 20 years.&lt;/p&gt;
 &lt;h2&gt;About DataFusion&lt;/h2&gt;
 &lt;p&gt;&lt;a href="https://datafusion.apache.org/"&gt;Apache 
DataFusion&lt;/a&gt; is an extensible query engine toolkit, written
 in Rust, that uses &lt;a href="https://arrow.apache.org/"&gt;Apache 
Arrow&lt;/a&gt; as its in-memory format. DataFusion and
diff --git a/output/feeds/blog.atom.xml b/output/feeds/blog.atom.xml
index 85737d7..73f895b 100644
--- a/output/feeds/blog.atom.xml
+++ b/output/feeds/blog.atom.xml
@@ -1,5 +1,5 @@
 <?xml version="1.0" encoding="utf-8"?>
-<feed xmlns="http://www.w3.org/2005/Atom"><title>Apache DataFusion Blog - 
blog</title><link href="https://datafusion.apache.org/blog/" 
rel="alternate"></link><link 
href="https://datafusion.apache.org/blog/feeds/blog.atom.xml" 
rel="self"></link><id>https://datafusion.apache.org/blog/</id><updated>2025-07-14T00:00:00+00:00</updated><subtitle></subtitle><entry><title>Embedding
 User-Defined Indexes in Apache Parquet Files</title><link 
href="https://datafusion.apache.org/blog/2025/07/14/user- [...]
+<feed xmlns="http://www.w3.org/2005/Atom"><title>Apache DataFusion Blog - 
blog</title><link href="https://datafusion.apache.org/blog/" 
rel="alternate"></link><link 
href="https://datafusion.apache.org/blog/feeds/blog.atom.xml" 
rel="self"></link><id>https://datafusion.apache.org/blog/</id><updated>2025-07-14T00:00:00+00:00</updated><subtitle></subtitle><entry><title>Embedding
 User-Defined Indexes in Apache Parquet Files</title><link 
href="https://datafusion.apache.org/blog/2025/07/14/user- [...]
 {% comment %}
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements.  See the NOTICE file distributed with
@@ -462,6 +462,15 @@ df.show().await?;
 &lt;p&gt;In this post, we explained how index structures are stored in Apache 
Parquet, how to embed user-defined indexes without changing the format, and how 
to use user-defined indexes to speed up query processing.&lt;/p&gt;
 &lt;p&gt;Parquet-based systems can achieve significant performance 
improvements for almost any query pattern while still retaining broad 
compatibility, using user-defined embedded indexes, external 
indexes&lt;sup&gt;&lt;a href="#footnote4"&gt;4&lt;/a&gt;&lt;/sup&gt; and 
rewriting files optimized for specific queries&lt;sup&gt;&lt;a 
href="#footnote5"&gt;5&lt;/a&gt;&lt;/sup&gt;. System designers can choose among 
the available options to make the appropriate trade-offs between operational c 
[...]
 &lt;p&gt;We hope this post inspires you to explore custom indexes in Parquet 
files, rather than proposing new file formats and reimplementing existing 
features. The DataFusion community is excited to see how you use this feature 
in your projects!&lt;/p&gt;
+&lt;h2&gt;About the Authors&lt;/h2&gt;
+&lt;p&gt;&lt;a href="https://www.linkedin.com/in/qi-zhu-862330119/"&gt;Qi 
Zhu&lt;/a&gt; is a Senior Engineer at &lt;a 
href="https://www.cloudera.com/"&gt;Cloudera&lt;/a&gt;, an active contributor 
to &lt;a href="https://datafusion.apache.org/"&gt;Apache DataFusion&lt;/a&gt; 
and &lt;a href="https://arrow.apache.org/"&gt;Apache Arrow&lt;/a&gt;, a 
committer on &lt;a href="https://hadoop.apache.org/"&gt;Apache Hadoop&lt;/a&gt; 
and &lt;a href="https://yunikorn.apache.org/"&gt;Apache YuniKorn&l [...]
+&lt;p&gt;&lt;a href="https://www.linkedin.com/in/jigao-luo/"&gt;Jigao 
Luo&lt;/a&gt; is a 1.5-year PhD student at
+&lt;a href="https://tuda.systems"&gt;Systems Group @ TU Darmstadt&lt;/a&gt;. 
Regarding Parquet, he is an external 
+contributor to &lt;a href="https://github.com/rapidsai/cudf"&gt;NVIDIA RAPIDS 
cuDF&lt;/a&gt;, focusing on the GPU Parquet reader.&lt;/p&gt;
+&lt;p&gt;&lt;a href="https://www.linkedin.com/in/andrewalamb/"&gt;Andrew 
Lamb&lt;/a&gt; is a Staff Engineer at
+&lt;a href="https://www.influxdata.com/"&gt;InfluxData&lt;/a&gt;, and a member 
of the &lt;a href="https://datafusion.apache.org/"&gt;Apache
+DataFusion&lt;/a&gt; and &lt;a href="https://arrow.apache.org/"&gt;Apache 
Arrow&lt;/a&gt; PMCs. He has been working on
+Databases and related systems more than 20 years.&lt;/p&gt;
 &lt;h2&gt;About DataFusion&lt;/h2&gt;
 &lt;p&gt;&lt;a href="https://datafusion.apache.org/"&gt;Apache 
DataFusion&lt;/a&gt; is an extensible query engine toolkit, written
 in Rust, that uses &lt;a href="https://arrow.apache.org/"&gt;Apache 
Arrow&lt;/a&gt; as its in-memory format. DataFusion and
diff --git a/output/feeds/qi-zhu-jigao-luo-and-andrew-lamb.atom.xml 
b/output/feeds/qi-zhu-cloudera-jigao-luo-systems-group-at-tu-darmstadt-and-andrew-lamb-influxdata.atom.xml
similarity index 94%
rename from output/feeds/qi-zhu-jigao-luo-and-andrew-lamb.atom.xml
rename to 
output/feeds/qi-zhu-cloudera-jigao-luo-systems-group-at-tu-darmstadt-and-andrew-lamb-influxdata.atom.xml
index 9247742..911bfd1 100644
--- a/output/feeds/qi-zhu-jigao-luo-and-andrew-lamb.atom.xml
+++ 
b/output/feeds/qi-zhu-cloudera-jigao-luo-systems-group-at-tu-darmstadt-and-andrew-lamb-influxdata.atom.xml
@@ -1,5 +1,5 @@
 <?xml version="1.0" encoding="utf-8"?>
-<feed xmlns="http://www.w3.org/2005/Atom"><title>Apache DataFusion Blog - Qi 
Zhu, Jigao Luo, and Andrew Lamb</title><link 
href="https://datafusion.apache.org/blog/" rel="alternate"></link><link 
href="https://datafusion.apache.org/blog/feeds/qi-zhu-jigao-luo-and-andrew-lamb.atom.xml"
 
rel="self"></link><id>https://datafusion.apache.org/blog/</id><updated>2025-07-14T00:00:00+00:00</updated><subtitle></subtitle><entry><title>Embedding
 User-Defined Indexes in Apache Parquet Files</title><link [...]
+<feed xmlns="http://www.w3.org/2005/Atom"><title>Apache DataFusion Blog - Qi 
Zhu (Cloudera), Jigao Luo (Systems Group at TU Darmstadt), and Andrew Lamb 
(InfluxData)</title><link href="https://datafusion.apache.org/blog/" 
rel="alternate"></link><link 
href="https://datafusion.apache.org/blog/feeds/qi-zhu-cloudera-jigao-luo-systems-group-at-tu-darmstadt-and-andrew-lamb-influxdata.atom.xml"
 
rel="self"></link><id>https://datafusion.apache.org/blog/</id><updated>2025-07-14T00:00:00+00:00</upda
 [...]
 {% comment %}
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements.  See the NOTICE file distributed with
@@ -462,6 +462,15 @@ df.show().await?;
 &lt;p&gt;In this post, we explained how index structures are stored in Apache 
Parquet, how to embed user-defined indexes without changing the format, and how 
to use user-defined indexes to speed up query processing.&lt;/p&gt;
 &lt;p&gt;Parquet-based systems can achieve significant performance 
improvements for almost any query pattern while still retaining broad 
compatibility, using user-defined embedded indexes, external 
indexes&lt;sup&gt;&lt;a href="#footnote4"&gt;4&lt;/a&gt;&lt;/sup&gt; and 
rewriting files optimized for specific queries&lt;sup&gt;&lt;a 
href="#footnote5"&gt;5&lt;/a&gt;&lt;/sup&gt;. System designers can choose among 
the available options to make the appropriate trade-offs between operational c 
[...]
 &lt;p&gt;We hope this post inspires you to explore custom indexes in Parquet 
files, rather than proposing new file formats and reimplementing existing 
features. The DataFusion community is excited to see how you use this feature 
in your projects!&lt;/p&gt;
+&lt;h2&gt;About the Authors&lt;/h2&gt;
+&lt;p&gt;&lt;a href="https://www.linkedin.com/in/qi-zhu-862330119/"&gt;Qi 
Zhu&lt;/a&gt; is a Senior Engineer at &lt;a 
href="https://www.cloudera.com/"&gt;Cloudera&lt;/a&gt;, an active contributor 
to &lt;a href="https://datafusion.apache.org/"&gt;Apache DataFusion&lt;/a&gt; 
and &lt;a href="https://arrow.apache.org/"&gt;Apache Arrow&lt;/a&gt;, a 
committer on &lt;a href="https://hadoop.apache.org/"&gt;Apache Hadoop&lt;/a&gt; 
and &lt;a href="https://yunikorn.apache.org/"&gt;Apache YuniKorn&l [...]
+&lt;p&gt;&lt;a href="https://www.linkedin.com/in/jigao-luo/"&gt;Jigao 
Luo&lt;/a&gt; is a 1.5-year PhD student at
+&lt;a href="https://tuda.systems"&gt;Systems Group @ TU Darmstadt&lt;/a&gt;. 
Regarding Parquet, he is an external 
+contributor to &lt;a href="https://github.com/rapidsai/cudf"&gt;NVIDIA RAPIDS 
cuDF&lt;/a&gt;, focusing on the GPU Parquet reader.&lt;/p&gt;
+&lt;p&gt;&lt;a href="https://www.linkedin.com/in/andrewalamb/"&gt;Andrew 
Lamb&lt;/a&gt; is a Staff Engineer at
+&lt;a href="https://www.influxdata.com/"&gt;InfluxData&lt;/a&gt;, and a member 
of the &lt;a href="https://datafusion.apache.org/"&gt;Apache
+DataFusion&lt;/a&gt; and &lt;a href="https://arrow.apache.org/"&gt;Apache 
Arrow&lt;/a&gt; PMCs. He has been working on
+Databases and related systems more than 20 years.&lt;/p&gt;
 &lt;h2&gt;About DataFusion&lt;/h2&gt;
 &lt;p&gt;&lt;a href="https://datafusion.apache.org/"&gt;Apache 
DataFusion&lt;/a&gt; is an extensible query engine toolkit, written
 in Rust, that uses &lt;a href="https://arrow.apache.org/"&gt;Apache 
Arrow&lt;/a&gt; as its in-memory format. DataFusion and
diff --git a/output/feeds/qi-zhu-jigao-luo-and-andrew-lamb.rss.xml 
b/output/feeds/qi-zhu-cloudera-jigao-luo-systems-group-at-tu-darmstadt-and-andrew-lamb-influxdata.rss.xml
similarity index 63%
rename from output/feeds/qi-zhu-jigao-luo-and-andrew-lamb.rss.xml
rename to 
output/feeds/qi-zhu-cloudera-jigao-luo-systems-group-at-tu-darmstadt-and-andrew-lamb-influxdata.rss.xml
index 79c9684..cf7c435 100644
--- a/output/feeds/qi-zhu-jigao-luo-and-andrew-lamb.rss.xml
+++ 
b/output/feeds/qi-zhu-cloudera-jigao-luo-systems-group-at-tu-darmstadt-and-andrew-lamb-influxdata.rss.xml
@@ -1,5 +1,5 @@
 <?xml version="1.0" encoding="utf-8"?>
-<rss version="2.0"><channel><title>Apache DataFusion Blog - Qi Zhu, Jigao Luo, 
and Andrew 
Lamb</title><link>https://datafusion.apache.org/blog/</link><description></description><lastBuildDate>Mon,
 14 Jul 2025 00:00:00 +0000</lastBuildDate><item><title>Embedding User-Defined 
Indexes in Apache Parquet 
Files</title><link>https://datafusion.apache.org/blog/2025/07/14/user-defined-parquet-indexes</link><description>&lt;!--
+<rss version="2.0"><channel><title>Apache DataFusion Blog - Qi Zhu (Cloudera), 
Jigao Luo (Systems Group at TU Darmstadt), and Andrew Lamb 
(InfluxData)</title><link>https://datafusion.apache.org/blog/</link><description></description><lastBuildDate>Mon,
 14 Jul 2025 00:00:00 +0000</lastBuildDate><item><title>Embedding User-Defined 
Indexes in Apache Parquet 
Files</title><link>https://datafusion.apache.org/blog/2025/07/14/user-defined-parquet-indexes</link><description>&lt;!--
 {% comment %}
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements.  See the NOTICE file distributed with
@@ -17,4 +17,4 @@ See the License for the specific language governing 
permissions and
 limitations under the License.
 {% endcomment %}
 --&gt;
-&lt;p&gt;It&amp;rsquo;s a common misconception that &lt;a 
href="https://parquet.apache.org/"&gt;Apache Parquet&lt;/a&gt; files are 
limited to basic Min/Max/Null Count statistics and Bloom filters, and that 
adding more advanced indexes requires changing the specification or creating a 
new file format. In fact, footer metadata and offset-based addressing already 
provide everything needed to embed …&lt;/p&gt;</description><dc:creator 
xmlns:dc="http://purl.org/dc/elements/1.1/">Qi Zhu, Jigao [...]
\ No newline at end of file
+&lt;p&gt;It&amp;rsquo;s a common misconception that &lt;a 
href="https://parquet.apache.org/"&gt;Apache Parquet&lt;/a&gt; files are 
limited to basic Min/Max/Null Count statistics and Bloom filters, and that 
adding more advanced indexes requires changing the specification or creating a 
new file format. In fact, footer metadata and offset-based addressing already 
provide everything needed to embed …&lt;/p&gt;</description><dc:creator 
xmlns:dc="http://purl.org/dc/elements/1.1/">Qi Zhu (Cloud [...]
\ No newline at end of file
diff --git a/output/index.html b/output/index.html
index 3165a1f..d156197 100644
--- a/output/index.html
+++ b/output/index.html
@@ -51,7 +51,7 @@
                 <header>
                     <div class="title">
                         <h1><a 
href="/blog/2025/07/14/user-defined-parquet-indexes">Embedding User-Defined 
Indexes in Apache Parquet Files</a></h1>
-                        <p>Posted on: Mon 14 July 2025 by Qi Zhu, Jigao Luo, 
and Andrew Lamb</p>
+                        <p>Posted on: Mon 14 July 2025 by Qi Zhu (Cloudera), 
Jigao Luo (Systems Group at TU Darmstadt), and Andrew Lamb (InfluxData)</p>
                         <p><!--
 {% comment %}
 Licensed to the Apache Software Foundation (ASF) under one or more

