This is an automated email from the ASF dual-hosted git repository.
github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/datafusion-site.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 92e711d Commit build products
92e711d is described below
commit 92e711d6f862ebaa09acd9afa6ac20205d861089
Author: Build Pelican (action) <[email protected]>
AuthorDate: Tue Jul 15 10:48:28 2025 +0000
Commit build products
---
output/2025/07/14/user-defined-parquet-indexes/index.html | 11 ++++++++++-
...ems-group-at-tu-darmstadt-and-andrew-lamb-influxdata.html} | 6 +++---
output/category/blog.html | 2 +-
output/feed.xml | 2 +-
output/feeds/all-en.atom.xml | 11 ++++++++++-
output/feeds/blog.atom.xml | 11 ++++++++++-
...group-at-tu-darmstadt-and-andrew-lamb-influxdata.atom.xml} | 11 ++++++++++-
...-group-at-tu-darmstadt-and-andrew-lamb-influxdata.rss.xml} | 4 ++--
output/index.html | 2 +-
9 files changed, 48 insertions(+), 12 deletions(-)
diff --git a/output/2025/07/14/user-defined-parquet-indexes/index.html b/output/2025/07/14/user-defined-parquet-indexes/index.html
index 0684735..9b74b62 100644
--- a/output/2025/07/14/user-defined-parquet-indexes/index.html
+++ b/output/2025/07/14/user-defined-parquet-indexes/index.html
@@ -42,7 +42,7 @@
<h1>
Embedding User-Defined Indexes in Apache Parquet Files
</h1>
- <p>Posted on: Mon 14 July 2025 by Qi Zhu, Jigao Luo, and Andrew
Lamb</p>
+ <p>Posted on: Mon 14 July 2025 by Qi Zhu (Cloudera), Jigao Luo
(Systems Group at TU Darmstadt), and Andrew Lamb (InfluxData)</p>
<!--
{% comment %}
Licensed to the Apache Software Foundation (ASF) under one or more
@@ -488,6 +488,15 @@ df.show().await?;
<p>In this post, we explained how index structures are stored in Apache
Parquet, how to embed user-defined indexes without changing the format, and how
to use user-defined indexes to speed up query processing.</p>
<p>Parquet-based systems can achieve significant performance improvements for
almost any query pattern while still retaining broad compatibility, using
user-defined embedded indexes, external indexes<sup><a
href="#footnote4">4</a></sup> and rewriting files optimized for specific
queries<sup><a href="#footnote5">5</a></sup>. System designers can choose among
the available options to make the appropriate trade-offs between operational
complexity, performance, file size, and cost for their [...]
<p>We hope this post inspires you to explore custom indexes in Parquet files,
rather than proposing new file formats and reimplementing existing features.
The DataFusion community is excited to see how you use this feature in your
projects!</p>
+<h2>About the Authors</h2>
+<p><a href="https://www.linkedin.com/in/qi-zhu-862330119/">Qi Zhu</a> is a
Senior Engineer at <a href="https://www.cloudera.com/">Cloudera</a>, an active
contributor to <a href="https://datafusion.apache.org/">Apache DataFusion</a>
and <a href="https://arrow.apache.org/">Apache Arrow</a>, a committer on <a
href="https://hadoop.apache.org/">Apache Hadoop</a> and <a
href="https://yunikorn.apache.org/">Apache YuniKorn</a>. He has extensive
experience in distributed systems, scheduling, and [...]
+<p><a href="https://www.linkedin.com/in/jigao-luo/">Jigao Luo</a> is a
1.5-year PhD student at
+<a href="https://tuda.systems">Systems Group @ TU Darmstadt</a>. Regarding
Parquet, he is an external
+contributor to <a href="https://github.com/rapidsai/cudf">NVIDIA RAPIDS
cuDF</a>, focusing on the GPU Parquet reader.</p>
+<p><a href="https://www.linkedin.com/in/andrewalamb/">Andrew Lamb</a> is a
Staff Engineer at
+<a href="https://www.influxdata.com/">InfluxData</a>, and a member of the <a
href="https://datafusion.apache.org/">Apache
+DataFusion</a> and <a href="https://arrow.apache.org/">Apache Arrow</a> PMCs.
He has been working on
+databases and related systems for more than 20 years.</p>
<h2>About DataFusion</h2>
<p><a href="https://datafusion.apache.org/">Apache DataFusion</a> is an
extensible query engine toolkit, written
in Rust, that uses <a href="https://arrow.apache.org/">Apache Arrow</a> as its
in-memory format. DataFusion and
diff --git a/output/author/qi-zhu-jigao-luo-and-andrew-lamb.html b/output/author/qi-zhu-cloudera-jigao-luo-systems-group-at-tu-darmstadt-and-andrew-lamb-influxdata.html
similarity index 87%
rename from output/author/qi-zhu-jigao-luo-and-andrew-lamb.html
rename to output/author/qi-zhu-cloudera-jigao-luo-systems-group-at-tu-darmstadt-and-andrew-lamb-influxdata.html
index 9b4fcf2..1bd4902 100644
--- a/output/author/qi-zhu-jigao-luo-and-andrew-lamb.html
+++ b/output/author/qi-zhu-cloudera-jigao-luo-systems-group-at-tu-darmstadt-and-andrew-lamb-influxdata.html
@@ -1,7 +1,7 @@
<!DOCTYPE html>
<html lang="en">
<head>
- <title>Apache DataFusion Blog - Articles by Qi Zhu, Jigao Luo, and
Andrew Lamb</title>
+ <title>Apache DataFusion Blog - Articles by Qi Zhu (Cloudera), Jigao
Luo (Systems Group at TU Darmstadt), and Andrew Lamb (InfluxData)</title>
<meta charset="utf-8" />
<meta name="generator" content="Pelican" />
<link href="https://datafusion.apache.org/blog/feed.xml"
type="application/rss+xml" rel="alternate" title="Apache DataFusion Blog RSS
Feed" />
@@ -17,7 +17,7 @@
<li><a
href="https://datafusion.apache.org/blog/category/blog.html">blog</a></li>
</ul></nav><!-- /#menu -->
<section id="content">
-<h2>Articles by Qi Zhu, Jigao Luo, and Andrew Lamb</h2>
+<h2>Articles by Qi Zhu (Cloudera), Jigao Luo (Systems Group at TU Darmstadt),
and Andrew Lamb (InfluxData)</h2>
<ol id="post-list">
<li><article class="hentry">
@@ -25,7 +25,7 @@
<footer class="post-info">
<time class="published"
datetime="2025-07-14T00:00:00+00:00"> Mon 14 July 2025 </time>
<address class="vcard author">By
- <a class="url fn"
href="https://datafusion.apache.org/blog/author/qi-zhu-jigao-luo-and-andrew-lamb.html">Qi
Zhu, Jigao Luo, and Andrew Lamb</a>
+ <a class="url fn"
href="https://datafusion.apache.org/blog/author/qi-zhu-cloudera-jigao-luo-systems-group-at-tu-darmstadt-and-andrew-lamb-influxdata.html">Qi
Zhu (Cloudera), Jigao Luo (Systems Group at TU Darmstadt), and Andrew Lamb
(InfluxData)</a>
</address>
</footer><!-- /.post-info -->
<div class="entry-content"> <!--
diff --git a/output/category/blog.html b/output/category/blog.html
index abf197e..959bcb2 100644
--- a/output/category/blog.html
+++ b/output/category/blog.html
@@ -26,7 +26,7 @@
<footer class="post-info">
<time class="published"
datetime="2025-07-14T00:00:00+00:00"> Mon 14 July 2025 </time>
<address class="vcard author">By
- <a class="url fn"
href="https://datafusion.apache.org/blog/author/qi-zhu-jigao-luo-and-andrew-lamb.html">Qi
Zhu, Jigao Luo, and Andrew Lamb</a>
+ <a class="url fn"
href="https://datafusion.apache.org/blog/author/qi-zhu-cloudera-jigao-luo-systems-group-at-tu-darmstadt-and-andrew-lamb-influxdata.html">Qi
Zhu (Cloudera), Jigao Luo (Systems Group at TU Darmstadt), and Andrew Lamb
(InfluxData)</a>
</address>
</footer><!-- /.post-info -->
<div class="entry-content"> <!--
diff --git a/output/feed.xml b/output/feed.xml
index 3da6133..5b2c40d 100644
--- a/output/feed.xml
+++ b/output/feed.xml
@@ -17,7 +17,7 @@ See the License for the specific language governing permissions and
limitations under the License.
{% endcomment %}
-->
-<p>It&rsquo;s a common misconception that <a
href="https://parquet.apache.org/">Apache Parquet</a> files are
limited to basic Min/Max/Null Count statistics and Bloom filters, and that
adding more advanced indexes requires changing the specification or creating a
new file format. In fact, footer metadata and offset-based addressing already
provide everything needed to embed …</p></description><dc:creator
xmlns:dc="http://purl.org/dc/elements/1.1/">Qi Zhu, Jigao [...]
+<p>It&rsquo;s a common misconception that <a
href="https://parquet.apache.org/">Apache Parquet</a> files are
limited to basic Min/Max/Null Count statistics and Bloom filters, and that
adding more advanced indexes requires changing the specification or creating a
new file format. In fact, footer metadata and offset-based addressing already
provide everything needed to embed …</p></description><dc:creator
xmlns:dc="http://purl.org/dc/elements/1.1/">Qi Zhu (Cloud [...]
{% comment %}
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
diff --git a/output/feeds/all-en.atom.xml b/output/feeds/all-en.atom.xml
index 8e2768b..7a2beb8 100644
--- a/output/feeds/all-en.atom.xml
+++ b/output/feeds/all-en.atom.xml
@@ -1,5 +1,5 @@
<?xml version="1.0" encoding="utf-8"?>
-<feed xmlns="http://www.w3.org/2005/Atom"><title>Apache DataFusion
Blog</title><link href="https://datafusion.apache.org/blog/"
rel="alternate"></link><link
href="https://datafusion.apache.org/blog/feeds/all-en.atom.xml"
rel="self"></link><id>https://datafusion.apache.org/blog/</id><updated>2025-07-14T00:00:00+00:00</updated><subtitle></subtitle><entry><title>Embedding
User-Defined Indexes in Apache Parquet Files</title><link
href="https://datafusion.apache.org/blog/2025/07/14/user-defin [...]
+<feed xmlns="http://www.w3.org/2005/Atom"><title>Apache DataFusion
Blog</title><link href="https://datafusion.apache.org/blog/"
rel="alternate"></link><link
href="https://datafusion.apache.org/blog/feeds/all-en.atom.xml"
rel="self"></link><id>https://datafusion.apache.org/blog/</id><updated>2025-07-14T00:00:00+00:00</updated><subtitle></subtitle><entry><title>Embedding
User-Defined Indexes in Apache Parquet Files</title><link
href="https://datafusion.apache.org/blog/2025/07/14/user-defin [...]
{% comment %}
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
@@ -462,6 +462,15 @@ df.show().await?;
<p>In this post, we explained how index structures are stored in Apache
Parquet, how to embed user-defined indexes without changing the format, and how
to use user-defined indexes to speed up query processing.</p>
<p>Parquet-based systems can achieve significant performance
improvements for almost any query pattern while still retaining broad
compatibility, using user-defined embedded indexes, external
indexes<sup><a href="#footnote4">4</a></sup> and
rewriting files optimized for specific queries<sup><a
href="#footnote5">5</a></sup>. System designers can choose among
the available options to make the appropriate trade-offs between operational c
[...]
<p>We hope this post inspires you to explore custom indexes in Parquet
files, rather than proposing new file formats and reimplementing existing
features. The DataFusion community is excited to see how you use this feature
in your projects!</p>
+<h2>About the Authors</h2>
+<p><a href="https://www.linkedin.com/in/qi-zhu-862330119/">Qi
Zhu</a> is a Senior Engineer at <a
href="https://www.cloudera.com/">Cloudera</a>, an active contributor
to <a href="https://datafusion.apache.org/">Apache DataFusion</a>
and <a href="https://arrow.apache.org/">Apache Arrow</a>, a
committer on <a href="https://hadoop.apache.org/">Apache Hadoop</a>
and <a href="https://yunikorn.apache.org/">Apache YuniKorn&l [...]
+<p><a href="https://www.linkedin.com/in/jigao-luo/">Jigao
Luo</a> is a 1.5-year PhD student at
+<a href="https://tuda.systems">Systems Group @ TU Darmstadt</a>.
Regarding Parquet, he is an external
+contributor to <a href="https://github.com/rapidsai/cudf">NVIDIA RAPIDS
cuDF</a>, focusing on the GPU Parquet reader.</p>
+<p><a href="https://www.linkedin.com/in/andrewalamb/">Andrew
Lamb</a> is a Staff Engineer at
+<a href="https://www.influxdata.com/">InfluxData</a>, and a member
of the <a href="https://datafusion.apache.org/">Apache
+DataFusion</a> and <a href="https://arrow.apache.org/">Apache
Arrow</a> PMCs. He has been working on
+databases and related systems for more than 20 years.&lt;/p&gt;
<h2>About DataFusion</h2>
<p><a href="https://datafusion.apache.org/">Apache
DataFusion</a> is an extensible query engine toolkit, written
in Rust, that uses <a href="https://arrow.apache.org/">Apache
Arrow</a> as its in-memory format. DataFusion and
diff --git a/output/feeds/blog.atom.xml b/output/feeds/blog.atom.xml
index 85737d7..73f895b 100644
--- a/output/feeds/blog.atom.xml
+++ b/output/feeds/blog.atom.xml
@@ -1,5 +1,5 @@
<?xml version="1.0" encoding="utf-8"?>
-<feed xmlns="http://www.w3.org/2005/Atom"><title>Apache DataFusion Blog -
blog</title><link href="https://datafusion.apache.org/blog/"
rel="alternate"></link><link
href="https://datafusion.apache.org/blog/feeds/blog.atom.xml"
rel="self"></link><id>https://datafusion.apache.org/blog/</id><updated>2025-07-14T00:00:00+00:00</updated><subtitle></subtitle><entry><title>Embedding
User-Defined Indexes in Apache Parquet Files</title><link
href="https://datafusion.apache.org/blog/2025/07/14/user- [...]
+<feed xmlns="http://www.w3.org/2005/Atom"><title>Apache DataFusion Blog -
blog</title><link href="https://datafusion.apache.org/blog/"
rel="alternate"></link><link
href="https://datafusion.apache.org/blog/feeds/blog.atom.xml"
rel="self"></link><id>https://datafusion.apache.org/blog/</id><updated>2025-07-14T00:00:00+00:00</updated><subtitle></subtitle><entry><title>Embedding
User-Defined Indexes in Apache Parquet Files</title><link
href="https://datafusion.apache.org/blog/2025/07/14/user- [...]
{% comment %}
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
@@ -462,6 +462,15 @@ df.show().await?;
<p>In this post, we explained how index structures are stored in Apache
Parquet, how to embed user-defined indexes without changing the format, and how
to use user-defined indexes to speed up query processing.</p>
<p>Parquet-based systems can achieve significant performance
improvements for almost any query pattern while still retaining broad
compatibility, using user-defined embedded indexes, external
indexes<sup><a href="#footnote4">4</a></sup> and
rewriting files optimized for specific queries<sup><a
href="#footnote5">5</a></sup>. System designers can choose among
the available options to make the appropriate trade-offs between operational c
[...]
<p>We hope this post inspires you to explore custom indexes in Parquet
files, rather than proposing new file formats and reimplementing existing
features. The DataFusion community is excited to see how you use this feature
in your projects!</p>
+<h2>About the Authors</h2>
+<p><a href="https://www.linkedin.com/in/qi-zhu-862330119/">Qi
Zhu</a> is a Senior Engineer at <a
href="https://www.cloudera.com/">Cloudera</a>, an active contributor
to <a href="https://datafusion.apache.org/">Apache DataFusion</a>
and <a href="https://arrow.apache.org/">Apache Arrow</a>, a
committer on <a href="https://hadoop.apache.org/">Apache Hadoop</a>
and <a href="https://yunikorn.apache.org/">Apache YuniKorn&l [...]
+<p><a href="https://www.linkedin.com/in/jigao-luo/">Jigao
Luo</a> is a 1.5-year PhD student at
+<a href="https://tuda.systems">Systems Group @ TU Darmstadt</a>.
Regarding Parquet, he is an external
+contributor to <a href="https://github.com/rapidsai/cudf">NVIDIA RAPIDS
cuDF</a>, focusing on the GPU Parquet reader.</p>
+<p><a href="https://www.linkedin.com/in/andrewalamb/">Andrew
Lamb</a> is a Staff Engineer at
+<a href="https://www.influxdata.com/">InfluxData</a>, and a member
of the <a href="https://datafusion.apache.org/">Apache
+DataFusion</a> and <a href="https://arrow.apache.org/">Apache
Arrow</a> PMCs. He has been working on
+databases and related systems for more than 20 years.&lt;/p&gt;
<h2>About DataFusion</h2>
<p><a href="https://datafusion.apache.org/">Apache
DataFusion</a> is an extensible query engine toolkit, written
in Rust, that uses <a href="https://arrow.apache.org/">Apache
Arrow</a> as its in-memory format. DataFusion and
diff --git a/output/feeds/qi-zhu-jigao-luo-and-andrew-lamb.atom.xml b/output/feeds/qi-zhu-cloudera-jigao-luo-systems-group-at-tu-darmstadt-and-andrew-lamb-influxdata.atom.xml
similarity index 94%
rename from output/feeds/qi-zhu-jigao-luo-and-andrew-lamb.atom.xml
rename to output/feeds/qi-zhu-cloudera-jigao-luo-systems-group-at-tu-darmstadt-and-andrew-lamb-influxdata.atom.xml
index 9247742..911bfd1 100644
--- a/output/feeds/qi-zhu-jigao-luo-and-andrew-lamb.atom.xml
+++ b/output/feeds/qi-zhu-cloudera-jigao-luo-systems-group-at-tu-darmstadt-and-andrew-lamb-influxdata.atom.xml
@@ -1,5 +1,5 @@
<?xml version="1.0" encoding="utf-8"?>
-<feed xmlns="http://www.w3.org/2005/Atom"><title>Apache DataFusion Blog - Qi
Zhu, Jigao Luo, and Andrew Lamb</title><link
href="https://datafusion.apache.org/blog/" rel="alternate"></link><link
href="https://datafusion.apache.org/blog/feeds/qi-zhu-jigao-luo-and-andrew-lamb.atom.xml"
rel="self"></link><id>https://datafusion.apache.org/blog/</id><updated>2025-07-14T00:00:00+00:00</updated><subtitle></subtitle><entry><title>Embedding
User-Defined Indexes in Apache Parquet Files</title><link [...]
+<feed xmlns="http://www.w3.org/2005/Atom"><title>Apache DataFusion Blog - Qi
Zhu (Cloudera), Jigao Luo (Systems Group at TU Darmstadt), and Andrew Lamb
(InfluxData)</title><link href="https://datafusion.apache.org/blog/"
rel="alternate"></link><link
href="https://datafusion.apache.org/blog/feeds/qi-zhu-cloudera-jigao-luo-systems-group-at-tu-darmstadt-and-andrew-lamb-influxdata.atom.xml"
rel="self"></link><id>https://datafusion.apache.org/blog/</id><updated>2025-07-14T00:00:00+00:00</upda
[...]
{% comment %}
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
@@ -462,6 +462,15 @@ df.show().await?;
<p>In this post, we explained how index structures are stored in Apache
Parquet, how to embed user-defined indexes without changing the format, and how
to use user-defined indexes to speed up query processing.</p>
<p>Parquet-based systems can achieve significant performance
improvements for almost any query pattern while still retaining broad
compatibility, using user-defined embedded indexes, external
indexes<sup><a href="#footnote4">4</a></sup> and
rewriting files optimized for specific queries<sup><a
href="#footnote5">5</a></sup>. System designers can choose among
the available options to make the appropriate trade-offs between operational c
[...]
<p>We hope this post inspires you to explore custom indexes in Parquet
files, rather than proposing new file formats and reimplementing existing
features. The DataFusion community is excited to see how you use this feature
in your projects!</p>
+<h2>About the Authors</h2>
+<p><a href="https://www.linkedin.com/in/qi-zhu-862330119/">Qi
Zhu</a> is a Senior Engineer at <a
href="https://www.cloudera.com/">Cloudera</a>, an active contributor
to <a href="https://datafusion.apache.org/">Apache DataFusion</a>
and <a href="https://arrow.apache.org/">Apache Arrow</a>, a
committer on <a href="https://hadoop.apache.org/">Apache Hadoop</a>
and <a href="https://yunikorn.apache.org/">Apache YuniKorn&l [...]
+<p><a href="https://www.linkedin.com/in/jigao-luo/">Jigao
Luo</a> is a 1.5-year PhD student at
+<a href="https://tuda.systems">Systems Group @ TU Darmstadt</a>.
Regarding Parquet, he is an external
+contributor to <a href="https://github.com/rapidsai/cudf">NVIDIA RAPIDS
cuDF</a>, focusing on the GPU Parquet reader.</p>
+<p><a href="https://www.linkedin.com/in/andrewalamb/">Andrew
Lamb</a> is a Staff Engineer at
+<a href="https://www.influxdata.com/">InfluxData</a>, and a member
of the <a href="https://datafusion.apache.org/">Apache
+DataFusion</a> and <a href="https://arrow.apache.org/">Apache
Arrow</a> PMCs. He has been working on
+databases and related systems for more than 20 years.&lt;/p&gt;
<h2>About DataFusion</h2>
<p><a href="https://datafusion.apache.org/">Apache
DataFusion</a> is an extensible query engine toolkit, written
in Rust, that uses <a href="https://arrow.apache.org/">Apache
Arrow</a> as its in-memory format. DataFusion and
diff --git a/output/feeds/qi-zhu-jigao-luo-and-andrew-lamb.rss.xml b/output/feeds/qi-zhu-cloudera-jigao-luo-systems-group-at-tu-darmstadt-and-andrew-lamb-influxdata.rss.xml
similarity index 63%
rename from output/feeds/qi-zhu-jigao-luo-and-andrew-lamb.rss.xml
rename to output/feeds/qi-zhu-cloudera-jigao-luo-systems-group-at-tu-darmstadt-and-andrew-lamb-influxdata.rss.xml
index 79c9684..cf7c435 100644
--- a/output/feeds/qi-zhu-jigao-luo-and-andrew-lamb.rss.xml
+++ b/output/feeds/qi-zhu-cloudera-jigao-luo-systems-group-at-tu-darmstadt-and-andrew-lamb-influxdata.rss.xml
@@ -1,5 +1,5 @@
<?xml version="1.0" encoding="utf-8"?>
-<rss version="2.0"><channel><title>Apache DataFusion Blog - Qi Zhu, Jigao Luo,
and Andrew
Lamb</title><link>https://datafusion.apache.org/blog/</link><description></description><lastBuildDate>Mon,
14 Jul 2025 00:00:00 +0000</lastBuildDate><item><title>Embedding User-Defined
Indexes in Apache Parquet
Files</title><link>https://datafusion.apache.org/blog/2025/07/14/user-defined-parquet-indexes</link><description><!--
+<rss version="2.0"><channel><title>Apache DataFusion Blog - Qi Zhu (Cloudera),
Jigao Luo (Systems Group at TU Darmstadt), and Andrew Lamb
(InfluxData)</title><link>https://datafusion.apache.org/blog/</link><description></description><lastBuildDate>Mon,
14 Jul 2025 00:00:00 +0000</lastBuildDate><item><title>Embedding User-Defined
Indexes in Apache Parquet
Files</title><link>https://datafusion.apache.org/blog/2025/07/14/user-defined-parquet-indexes</link><description><!--
{% comment %}
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
@@ -17,4 +17,4 @@ See the License for the specific language governing permissions and
limitations under the License.
{% endcomment %}
-->
-<p>It&rsquo;s a common misconception that <a
href="https://parquet.apache.org/">Apache Parquet</a> files are
limited to basic Min/Max/Null Count statistics and Bloom filters, and that
adding more advanced indexes requires changing the specification or creating a
new file format. In fact, footer metadata and offset-based addressing already
provide everything needed to embed …</p></description><dc:creator
xmlns:dc="http://purl.org/dc/elements/1.1/">Qi Zhu, Jigao [...]
\ No newline at end of file
+<p>It&rsquo;s a common misconception that <a
href="https://parquet.apache.org/">Apache Parquet</a> files are
limited to basic Min/Max/Null Count statistics and Bloom filters, and that
adding more advanced indexes requires changing the specification or creating a
new file format. In fact, footer metadata and offset-based addressing already
provide everything needed to embed …</p></description><dc:creator
xmlns:dc="http://purl.org/dc/elements/1.1/">Qi Zhu (Cloud [...]
\ No newline at end of file
diff --git a/output/index.html b/output/index.html
index 3165a1f..d156197 100644
--- a/output/index.html
+++ b/output/index.html
@@ -51,7 +51,7 @@
<header>
<div class="title">
<h1><a
href="/blog/2025/07/14/user-defined-parquet-indexes">Embedding User-Defined
Indexes in Apache Parquet Files</a></h1>
- <p>Posted on: Mon 14 July 2025 by Qi Zhu, Jigao Luo,
and Andrew Lamb</p>
+ <p>Posted on: Mon 14 July 2025 by Qi Zhu (Cloudera),
Jigao Luo (Systems Group at TU Darmstadt), and Andrew Lamb (InfluxData)</p>
<p><!--
{% comment %}
Licensed to the Apache Software Foundation (ASF) under one or more