This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/datafusion-site.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new 331b655  Commit build products
331b655 is described below

commit 331b655d7ce74e587895cf28791f105b69f51128
Author: Build Pelican (action) <[email protected]>
AuthorDate: Thu Jul 17 19:23:26 2025 +0000

    Commit build products
---
 output/2025/07/14/user-defined-parquet-indexes/index.html            | 3 ++-
 output/feeds/all-en.atom.xml                                         | 5 +++--
 output/feeds/blog.atom.xml                                           | 5 +++--
 ...systems-group-at-tu-darmstadt-and-andrew-lamb-influxdata.atom.xml | 5 +++--
 4 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/output/2025/07/14/user-defined-parquet-indexes/index.html b/output/2025/07/14/user-defined-parquet-indexes/index.html
index 9b74b62..266ce01 100644
--- a/output/2025/07/14/user-defined-parquet-indexes/index.html
+++ b/output/2025/07/14/user-defined-parquet-indexes/index.html
@@ -103,7 +103,7 @@ limitations under the License.
 <p>Modern Parquet writers create these indexes automatically and provide APIs 
to control their generation and placement. For example, the <a 
href="https://docs.rs/parquet/latest/parquet/">Rust Parquet Library</a> 
provides <a 
href="https://docs.rs/parquet/latest/parquet/file/properties/struct.WriterProperties.html">Parquet
 WriterProperties</a>, <a 
href="https://docs.rs/parquet/latest/parquet/file/properties/enum.EnabledStatistics.html">EnabledStatistics</a>,
 and <a href="https://docs.rs/p [...]
 <h2>Embedding User Defined Indexes in Parquet Files</h2>
 <hr/>
-<p>Embedding user-defined indexes in Parquet files is straightforward and 
follows the same principles as standard index structures:</p>
+<p>Embedding user-defined indexes in Parquet files is straightforward and 
follows the same principles as standard index structures<sup><a 
href="#footnote6">6</a></sup>:</p>
 <ol>
 <li>
 <p>Serialize the index into a binary format and write it into the file body 
before the Thrift-encoded footer metadata.</p>
@@ -513,6 +513,7 @@ it out, we would love for you to join us.</p>
 <p><a id="footnote3"></a><code>3</code>: <a 
href="https://dl.gi.de/items/2a8571f8-0ef2-481c-8ee9-05f82ee258c8">Seamless 
Integration of Parquet Files into Data Processing. / Rey, Alice; Freitag, 
Michael; Neumann, Thomas. / BTW 2023</a></p>
 <p><a id="footnote4"></a><code>4</code>: For more information about external 
indexes, see <a href="https://www.youtube.com/watch?v=74YsJT1-Rdk">this 
talk</a> and the <a 
href="https://github.com/apache/datafusion/blob/main/datafusion-examples/examples/parquet_index.rs">parquet_index.rs</a>
 and <a 
href="https://github.com/apache/datafusion/blob/main/datafusion-examples/examples/advanced_parquet_index.rs">advanced_parquet_index.rs</a>
 examples in the DataFusion repository.</p>
 <p><a id="footnote5"></a><code>5</code>: For information about rewriting files 
to optimize for specific queries, such as resorting, repartitioning, and tuning 
data page and row group sizes, see <a 
href="https://github.com/XiangpengHao/liquid-cache/issues/227">XiangpengHao/liquid‑cache#227</a>
 and the conversation between <a 
href="https://github.com/JigaoLuo">JigaoLuo</a> and <a 
href="https://github.com/XiangpengHao">XiangpengHao</a> for details. We hope to 
make a future post about this t [...]
+<p><a id="footnote6"></a><code>6</code>: An index can also be stored inline in 
the key-value metadata. This approach is simple to implement and ensures the 
index is available once the footer is read, without additional I/O. However, it 
requires the index to be serialized as a UTF-8 string, which may be less 
efficient and increases the size of the footer metadata, impacting all Parquet 
readers, even those that ignore the index.</p>
         </div>
       </div>
     </div>    
diff --git a/output/feeds/all-en.atom.xml b/output/feeds/all-en.atom.xml
index f74d67e..3d9c139 100644
--- a/output/feeds/all-en.atom.xml
+++ b/output/feeds/all-en.atom.xml
@@ -271,7 +271,7 @@ limitations under the License.
 &lt;p&gt;Modern Parquet writers create these indexes automatically and provide 
APIs to control their generation and placement. For example, the &lt;a 
href="https://docs.rs/parquet/latest/parquet/"&gt;Rust Parquet 
Library&lt;/a&gt; provides &lt;a 
href="https://docs.rs/parquet/latest/parquet/file/properties/struct.WriterProperties.html"&gt;Parquet
 WriterProperties&lt;/a&gt;, &lt;a 
href="https://docs.rs/parquet/latest/parquet/file/properties/enum.EnabledStatistics.html"&gt;EnabledStatistics
 [...]
 &lt;h2&gt;Embedding User Defined Indexes in Parquet Files&lt;/h2&gt;
 &lt;hr/&gt;
-&lt;p&gt;Embedding user-defined indexes in Parquet files is straightforward 
and follows the same principles as standard index structures:&lt;/p&gt;
+&lt;p&gt;Embedding user-defined indexes in Parquet files is straightforward 
and follows the same principles as standard index structures&lt;sup&gt;&lt;a 
href="#footnote6"&gt;6&lt;/a&gt;&lt;/sup&gt;:&lt;/p&gt;
 &lt;ol&gt;
 &lt;li&gt;
 &lt;p&gt;Serialize the index into a binary format and write it into the file 
body before the Thrift-encoded footer metadata.&lt;/p&gt;
@@ -680,7 +680,8 @@ it out, we would love for you to join us.&lt;/p&gt;
 &lt;p&gt;&lt;a id="footnote2"&gt;&lt;/a&gt;&lt;code&gt;2&lt;/code&gt;: There 
are other index structures, but they are either 1) not widely supported (such 
as statistics in the page headers) or 2) not yet widely used in practice at the 
time of this writing (such as &lt;a 
href="https://github.com/apache/parquet-format/blob/819adce0ec6aa848e56c56f20b9347f4ab50857f/src/main/thrift/parquet.thrift#L256"&gt;GeospatialStatistics&lt;/a&gt;
 and &lt;a href="https://github.com/apache/parquet-format/ [...]
 &lt;p&gt;&lt;a id="footnote3"&gt;&lt;/a&gt;&lt;code&gt;3&lt;/code&gt;: &lt;a 
href="https://dl.gi.de/items/2a8571f8-0ef2-481c-8ee9-05f82ee258c8"&gt;Seamless 
Integration of Parquet Files into Data Processing. / Rey, Alice; Freitag, 
Michael; Neumann, Thomas. / BTW 2023&lt;/a&gt;&lt;/p&gt;
 &lt;p&gt;&lt;a id="footnote4"&gt;&lt;/a&gt;&lt;code&gt;4&lt;/code&gt;: For 
more information about external indexes, see &lt;a 
href="https://www.youtube.com/watch?v=74YsJT1-Rdk"&gt;this talk&lt;/a&gt; and 
the &lt;a 
href="https://github.com/apache/datafusion/blob/main/datafusion-examples/examples/parquet_index.rs"&gt;parquet_index.rs&lt;/a&gt;
 and &lt;a 
href="https://github.com/apache/datafusion/blob/main/datafusion-examples/examples/advanced_parquet_index.rs"&gt;advanced_parquet_index.rs&;
 [...]
-&lt;p&gt;&lt;a id="footnote5"&gt;&lt;/a&gt;&lt;code&gt;5&lt;/code&gt;: For 
information about rewriting files to optimize for specific queries, such as 
resorting, repartitioning, and tuning data page and row group sizes, see &lt;a 
href="https://github.com/XiangpengHao/liquid-cache/issues/227"&gt;XiangpengHao/liquid‑cache#227&lt;/a&gt;
 and the conversation between &lt;a 
href="https://github.com/JigaoLuo"&gt;JigaoLuo&lt;/a&gt; and &lt;a 
href="https://github.com/XiangpengHao"&gt;XiangpengHao [...]
+&lt;p&gt;&lt;a id="footnote5"&gt;&lt;/a&gt;&lt;code&gt;5&lt;/code&gt;: For 
information about rewriting files to optimize for specific queries, such as 
resorting, repartitioning, and tuning data page and row group sizes, see &lt;a 
href="https://github.com/XiangpengHao/liquid-cache/issues/227"&gt;XiangpengHao/liquid‑cache#227&lt;/a&gt;
 and the conversation between &lt;a 
href="https://github.com/JigaoLuo"&gt;JigaoLuo&lt;/a&gt; and &lt;a 
href="https://github.com/XiangpengHao"&gt;XiangpengHao [...]
+&lt;p&gt;&lt;a id="footnote6"&gt;&lt;/a&gt;&lt;code&gt;6&lt;/code&gt;: An 
index can also be stored inline in the key-value metadata. This approach is 
simple to implement and ensures the index is available once the footer is read, 
without additional I/O. However, it requires the index to be serialized as a 
UTF-8 string, which may be less efficient and increases the size of the footer 
metadata, impacting all Parquet readers, even those that ignore the 
index.&lt;/p&gt;</content><category te [...]
 {% comment %}
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements.  See the NOTICE file distributed with
diff --git a/output/feeds/blog.atom.xml b/output/feeds/blog.atom.xml
index 432770f..bb30323 100644
--- a/output/feeds/blog.atom.xml
+++ b/output/feeds/blog.atom.xml
@@ -271,7 +271,7 @@ limitations under the License.
 &lt;p&gt;Modern Parquet writers create these indexes automatically and provide 
APIs to control their generation and placement. For example, the &lt;a 
href="https://docs.rs/parquet/latest/parquet/"&gt;Rust Parquet 
Library&lt;/a&gt; provides &lt;a 
href="https://docs.rs/parquet/latest/parquet/file/properties/struct.WriterProperties.html"&gt;Parquet
 WriterProperties&lt;/a&gt;, &lt;a 
href="https://docs.rs/parquet/latest/parquet/file/properties/enum.EnabledStatistics.html"&gt;EnabledStatistics
 [...]
 &lt;h2&gt;Embedding User Defined Indexes in Parquet Files&lt;/h2&gt;
 &lt;hr/&gt;
-&lt;p&gt;Embedding user-defined indexes in Parquet files is straightforward 
and follows the same principles as standard index structures:&lt;/p&gt;
+&lt;p&gt;Embedding user-defined indexes in Parquet files is straightforward 
and follows the same principles as standard index structures&lt;sup&gt;&lt;a 
href="#footnote6"&gt;6&lt;/a&gt;&lt;/sup&gt;:&lt;/p&gt;
 &lt;ol&gt;
 &lt;li&gt;
 &lt;p&gt;Serialize the index into a binary format and write it into the file 
body before the Thrift-encoded footer metadata.&lt;/p&gt;
@@ -680,7 +680,8 @@ it out, we would love for you to join us.&lt;/p&gt;
 &lt;p&gt;&lt;a id="footnote2"&gt;&lt;/a&gt;&lt;code&gt;2&lt;/code&gt;: There 
are other index structures, but they are either 1) not widely supported (such 
as statistics in the page headers) or 2) not yet widely used in practice at the 
time of this writing (such as &lt;a 
href="https://github.com/apache/parquet-format/blob/819adce0ec6aa848e56c56f20b9347f4ab50857f/src/main/thrift/parquet.thrift#L256"&gt;GeospatialStatistics&lt;/a&gt;
 and &lt;a href="https://github.com/apache/parquet-format/ [...]
 &lt;p&gt;&lt;a id="footnote3"&gt;&lt;/a&gt;&lt;code&gt;3&lt;/code&gt;: &lt;a 
href="https://dl.gi.de/items/2a8571f8-0ef2-481c-8ee9-05f82ee258c8"&gt;Seamless 
Integration of Parquet Files into Data Processing. / Rey, Alice; Freitag, 
Michael; Neumann, Thomas. / BTW 2023&lt;/a&gt;&lt;/p&gt;
 &lt;p&gt;&lt;a id="footnote4"&gt;&lt;/a&gt;&lt;code&gt;4&lt;/code&gt;: For 
more information about external indexes, see &lt;a 
href="https://www.youtube.com/watch?v=74YsJT1-Rdk"&gt;this talk&lt;/a&gt; and 
the &lt;a 
href="https://github.com/apache/datafusion/blob/main/datafusion-examples/examples/parquet_index.rs"&gt;parquet_index.rs&lt;/a&gt;
 and &lt;a 
href="https://github.com/apache/datafusion/blob/main/datafusion-examples/examples/advanced_parquet_index.rs"&gt;advanced_parquet_index.rs&;
 [...]
-&lt;p&gt;&lt;a id="footnote5"&gt;&lt;/a&gt;&lt;code&gt;5&lt;/code&gt;: For 
information about rewriting files to optimize for specific queries, such as 
resorting, repartitioning, and tuning data page and row group sizes, see &lt;a 
href="https://github.com/XiangpengHao/liquid-cache/issues/227"&gt;XiangpengHao/liquid‑cache#227&lt;/a&gt;
 and the conversation between &lt;a 
href="https://github.com/JigaoLuo"&gt;JigaoLuo&lt;/a&gt; and &lt;a 
href="https://github.com/XiangpengHao"&gt;XiangpengHao [...]
+&lt;p&gt;&lt;a id="footnote5"&gt;&lt;/a&gt;&lt;code&gt;5&lt;/code&gt;: For 
information about rewriting files to optimize for specific queries, such as 
resorting, repartitioning, and tuning data page and row group sizes, see &lt;a 
href="https://github.com/XiangpengHao/liquid-cache/issues/227"&gt;XiangpengHao/liquid‑cache#227&lt;/a&gt;
 and the conversation between &lt;a 
href="https://github.com/JigaoLuo"&gt;JigaoLuo&lt;/a&gt; and &lt;a 
href="https://github.com/XiangpengHao"&gt;XiangpengHao [...]
+&lt;p&gt;&lt;a id="footnote6"&gt;&lt;/a&gt;&lt;code&gt;6&lt;/code&gt;: An 
index can also be stored inline in the key-value metadata. This approach is 
simple to implement and ensures the index is available once the footer is read, 
without additional I/O. However, it requires the index to be serialized as a 
UTF-8 string, which may be less efficient and increases the size of the footer 
metadata, impacting all Parquet readers, even those that ignore the 
index.&lt;/p&gt;</content><category te [...]
 {% comment %}
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements.  See the NOTICE file distributed with
diff --git a/output/feeds/qi-zhu-cloudera-jigao-luo-systems-group-at-tu-darmstadt-and-andrew-lamb-influxdata.atom.xml b/output/feeds/qi-zhu-cloudera-jigao-luo-systems-group-at-tu-darmstadt-and-andrew-lamb-influxdata.atom.xml
index 911bfd1..278b9d1 100644
--- a/output/feeds/qi-zhu-cloudera-jigao-luo-systems-group-at-tu-darmstadt-and-andrew-lamb-influxdata.atom.xml
+++ b/output/feeds/qi-zhu-cloudera-jigao-luo-systems-group-at-tu-darmstadt-and-andrew-lamb-influxdata.atom.xml
@@ -77,7 +77,7 @@ limitations under the License.
 &lt;p&gt;Modern Parquet writers create these indexes automatically and provide 
APIs to control their generation and placement. For example, the &lt;a 
href="https://docs.rs/parquet/latest/parquet/"&gt;Rust Parquet 
Library&lt;/a&gt; provides &lt;a 
href="https://docs.rs/parquet/latest/parquet/file/properties/struct.WriterProperties.html"&gt;Parquet
 WriterProperties&lt;/a&gt;, &lt;a 
href="https://docs.rs/parquet/latest/parquet/file/properties/enum.EnabledStatistics.html"&gt;EnabledStatistics
 [...]
 &lt;h2&gt;Embedding User Defined Indexes in Parquet Files&lt;/h2&gt;
 &lt;hr/&gt;
-&lt;p&gt;Embedding user-defined indexes in Parquet files is straightforward 
and follows the same principles as standard index structures:&lt;/p&gt;
+&lt;p&gt;Embedding user-defined indexes in Parquet files is straightforward 
and follows the same principles as standard index structures&lt;sup&gt;&lt;a 
href="#footnote6"&gt;6&lt;/a&gt;&lt;/sup&gt;:&lt;/p&gt;
 &lt;ol&gt;
 &lt;li&gt;
 &lt;p&gt;Serialize the index into a binary format and write it into the file 
body before the Thrift-encoded footer metadata.&lt;/p&gt;
@@ -486,4 +486,5 @@ it out, we would love for you to join us.&lt;/p&gt;
 &lt;p&gt;&lt;a id="footnote2"&gt;&lt;/a&gt;&lt;code&gt;2&lt;/code&gt;: There 
are other index structures, but they are either 1) not widely supported (such 
as statistics in the page headers) or 2) not yet widely used in practice at the 
time of this writing (such as &lt;a 
href="https://github.com/apache/parquet-format/blob/819adce0ec6aa848e56c56f20b9347f4ab50857f/src/main/thrift/parquet.thrift#L256"&gt;GeospatialStatistics&lt;/a&gt;
 and &lt;a href="https://github.com/apache/parquet-format/ [...]
 &lt;p&gt;&lt;a id="footnote3"&gt;&lt;/a&gt;&lt;code&gt;3&lt;/code&gt;: &lt;a 
href="https://dl.gi.de/items/2a8571f8-0ef2-481c-8ee9-05f82ee258c8"&gt;Seamless 
Integration of Parquet Files into Data Processing. / Rey, Alice; Freitag, 
Michael; Neumann, Thomas. / BTW 2023&lt;/a&gt;&lt;/p&gt;
 &lt;p&gt;&lt;a id="footnote4"&gt;&lt;/a&gt;&lt;code&gt;4&lt;/code&gt;: For 
more information about external indexes, see &lt;a 
href="https://www.youtube.com/watch?v=74YsJT1-Rdk"&gt;this talk&lt;/a&gt; and 
the &lt;a 
href="https://github.com/apache/datafusion/blob/main/datafusion-examples/examples/parquet_index.rs"&gt;parquet_index.rs&lt;/a&gt;
 and &lt;a 
href="https://github.com/apache/datafusion/blob/main/datafusion-examples/examples/advanced_parquet_index.rs"&gt;advanced_parquet_index.rs&;
 [...]
-&lt;p&gt;&lt;a id="footnote5"&gt;&lt;/a&gt;&lt;code&gt;5&lt;/code&gt;: For 
information about rewriting files to optimize for specific queries, such as 
resorting, repartitioning, and tuning data page and row group sizes, see &lt;a 
href="https://github.com/XiangpengHao/liquid-cache/issues/227"&gt;XiangpengHao/liquid‑cache#227&lt;/a&gt;
 and the conversation between &lt;a 
href="https://github.com/JigaoLuo"&gt;JigaoLuo&lt;/a&gt; and &lt;a 
href="https://github.com/XiangpengHao"&gt;XiangpengHao [...]
\ No newline at end of file
+&lt;p&gt;&lt;a id="footnote5"&gt;&lt;/a&gt;&lt;code&gt;5&lt;/code&gt;: For 
information about rewriting files to optimize for specific queries, such as 
resorting, repartitioning, and tuning data page and row group sizes, see &lt;a 
href="https://github.com/XiangpengHao/liquid-cache/issues/227"&gt;XiangpengHao/liquid‑cache#227&lt;/a&gt;
 and the conversation between &lt;a 
href="https://github.com/JigaoLuo"&gt;JigaoLuo&lt;/a&gt; and &lt;a 
href="https://github.com/XiangpengHao"&gt;XiangpengHao [...]
+&lt;p&gt;&lt;a id="footnote6"&gt;&lt;/a&gt;&lt;code&gt;6&lt;/code&gt;: An 
index can also be stored inline in the key-value metadata. This approach is 
simple to implement and ensures the index is available once the footer is read, 
without additional I/O. However, it requires the index to be serialized as a 
UTF-8 string, which may be less efficient and increases the size of the footer 
metadata, impacting all Parquet readers, even those that ignore the 
index.&lt;/p&gt;</content><category te [...]
\ No newline at end of file


