Re: Orcus 0.19.0, Apache Arrow and CMake

2023-11-13 Thread Kohei Yoshida



On 9/28/23 21:16, Kohei Yoshida wrote:
> As an aside, if someone wants to try out this experimental parquet
> import filter, one can build orcus independently with
> --with-parquet-filter passed to configure after also having built the
> Apache Arrow library, apply this change
>
> https://github.com/kohei-us/core/commit/ae1390947246e44a6cd3d9b8af8c46b60619a698
>
> then build libreoffice with --with-system-orcus.  Then you should be
> able to simply open a parquet file and Calc should open it.


If you use orcus 0.19.1 or newer, you won't need to apply this custom 
change at all.  If you just build orcus with --with-parquet-filter then 
build libreoffice with --with-system-orcus, Calc should be able to open 
parquet files.
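
A minimal sketch of that two-step sequence for orcus 0.19.1 or newer. It assumes orcus is built from a release tarball with its configure script and that LibreOffice is configured through autogen.sh. The locations in ORCUS_SRC, LO_SRC and PREFIX, the run helper, and the use of PKG_CONFIG_PATH to point the LibreOffice build at the installed orcus are assumptions made for illustration; only the two --with-... options come from this thread. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only. ORCUS_SRC, LO_SRC and PREFIX are hypothetical locations;
# adjust them to your own source and install directories.
ORCUS_SRC=${ORCUS_SRC:-$HOME/src/liborcus-0.19.1}
LO_SRC=${LO_SRC:-$HOME/src/libreoffice}
PREFIX=${PREFIX:-$HOME/opt/orcus}

# Print each command by default; run with DRY_RUN=0 to execute for real.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1: build and install orcus with the parquet filter enabled.
# Apache Arrow must already be installed where configure can find it.
build_orcus() {
    run cd "$ORCUS_SRC" &&
    run ./configure --prefix="$PREFIX" --with-parquet-filter &&
    run make &&
    run make install
}

# Step 2: build LibreOffice against that orcus instead of the bundled copy.
# Passing PKG_CONFIG_PATH is an assumption about how the installed orcus
# is located; other options your setup needs are not shown.
build_libreoffice() {
    run cd "$LO_SRC" &&
    run env PKG_CONFIG_PATH="$PREFIX/lib/pkgconfig" \
        ./autogen.sh --with-system-orcus &&
    run make
}

build_orcus && build_libreoffice
```

Running it with DRY_RUN=0 executes the same commands instead of printing them.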


As for the idea of integrating CMake into our build system, I briefly 
looked into it, but I won't be working on it myself. It's simply beyond 
my level of expertise and interest.


Best,
Kohei



7.6s sc_subsequent_filters_test and orcus 0.19 (was: Re: Orcus 0.19.0, Apache Arrow and CMake)

2023-10-29 Thread Rene Engelhard

Hi,

On 29.09.23 at 03:16, Kohei Yoshida wrote:
> I just upgraded the orcus library on the master branch to the latest
> version, 0.19.0.  This version includes experimental support for
> Apache Parquet format import, but it relies on an external library
> called Apache Arrow[1].  The support for this file format is disabled
> in the libreoffice build.


I uploaded 0.19 to Debian experimental (which as of now only has LO 
7.6.x) since it's API/ABI compatible and the tests fail:


[build CUT] sc_subsequent_filters_test
S=/home/rene/LibreOffice/git/libreoffice-7-6 && I=$S/instdir && 
W=$S/workdir &&  mkdir -p $W/CppunitTest/ && rm -fr 
$W/CppunitTest/sc_subsequent_filters_test.test.user && cp -r $W/unittest 
$W/CppunitTest/sc_subsequent_filters_test.test.user &&    rm -fr 
$W/CppunitTest/sc_subsequent_filters_test.test.core && mkdir 
$W/CppunitTest/sc_subsequent_filters_test.test.core && cd 
$W/CppunitTest/sc_subsequent_filters_test.test.core &&  (  
MAX_CONCURRENCY=4 MOZILLA_CERTIFICATE_FOLDER=dbm: 
SAL_DISABLE_SYNCHRONOUS_PRINTER_DETECTION=1 SAL_USE_VCLPLUGIN=svp 
LIBO_LANG=C 
LD_LIBRARY_PATH=${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}"$I/program:$I/program":$W/UnpackedTarball/cppunit/src/cppunit/.libs 
$W/LinkTarget/Executable/cppunittester 
$W/LinkTarget/CppunitTest/libtest_sc_subsequent_filters_test.so 
--headless "-env:BRAND_BASE_DIR=file://$S/instdir" 
"-env:BRAND_SHARE_SUBDIR=share" 
"-env:BRAND_SHARE_RESOURCE_SUBDIR=program/resource" 
"-env:UserInstallation=file://$W/CppunitTest/sc_subsequent_filters_test.test.user" 
"-env:CONFIGURATION_LAYERS=xcsxcu:file://$I/share/registry 
xcsxcu:file://$W/unittest/registry-common 
xcsxcu:file://$W/unittest/registry-user-ui" 
"-env:UNO_TYPES=file://$I/program/types.rdb 
file://$I/program/types/offapi.rdb file://$I/program/types/oovbaapi.rdb" 
"-env:UNO_SERVICES=file://$W/Rdb/ure/services.rdb 
file://$W/Rdb/services.rdb" -env:URE_BIN_DIR=file://$I/program 
-env:URE_INTERNAL_LIB_DIR=file://$I/program 
-env:LO_LIB_DIR=file://$I/program 
-env:LO_JAVA_DIR=file://$I/program/classes --protector 
$W/LinkTarget/Library/unoexceptionprotector.so unoexceptionprotector 
--protector $W/LinkTarget/Library/unobootstrapprotector.so 
unobootstrapprotector   --protector 
$W/LinkTarget/Library/libvclbootstrapprotector.so vclbootstrapprotector 
-env:arg-env=LD_LIBRARY_PATH"${LD_LIBRARY_PATH+=$LD_LIBRARY_PATH}" 
"-env:CPPUNITTESTTARGET=$W/CppunitTest/sc_subsequent_filters_test.test" 
) 2>&1

[_RUN_] testBasicCellContentODS::TestBody
testBasicCellContentODS::TestBody finished in: 209ms
[_RUN_] testBooleanFormatXLSX::TestBody
testBooleanFormatXLSX::TestBody finished in: 48ms
[_RUN_] testBorderODS::TestBody
testBorderODS::TestBody finished in: 33ms
[_RUN_] testBordersOoo33::TestBody
testBordersOoo33::TestBody finished in: 35ms
[_RUN_] testBrokenQuotesCSV::TestBody
testBrokenQuotesCSV::TestBody finished in: 51ms
[_RUN_] testBugFixesODS::TestBody
testBugFixesODS::TestBody finished in: 37ms
[_RUN_] testBugFixesXLS::TestBody
testBugFixesXLS::TestBody finished in: 34ms
[_RUN_] testBugFixesXLSX::TestBody
testBugFixesXLSX::TestBody finished in: 56ms
[_RUN_] testCachedFormulaResultsODS::TestBody
testCachedFormulaResultsODS::TestBody finished in: 123ms
[_RUN_] testCachedMatrixFormulaResultsODS::TestBody
testCachedMatrixFormulaResultsODS::TestBody finished in: 41ms
[_RUN_] testCeilingFloorXLSX::TestBody
testCeilingFloorXLSX::TestBody finished in: 42ms
[_RUN_] testCellValueXLSX::TestBody
testCellValueXLSX::TestBody finished in: 65ms
[_RUN_] testCondFormatBeginsAndEndsWithXLSX::TestBody
testCondFormatBeginsAndEndsWithXLSX::TestBody finished in: 49ms
[_RUN_] testCondFormatCfvoScaleValueXLSX::TestBody
testCondFormatCfvoScaleValueXLSX::TestBody finished in: 40ms
[_RUN_] testCondFormatFormulaIsXLSX::TestBody
testCondFormatFormulaIsXLSX::TestBody finished in: 47ms
[_RUN_] testCondFormatOperatorsSameRangeXLSX::TestBody
testCondFormatOperatorsSameRangeXLSX::TestBody finished in: 38ms
[_RUN_] testContentDIF::TestBody
testContentDIF::TestBody finished in: 212ms
[_RUN_] testContentGnumeric::TestBody
Segmentation fault (core dumped)
make[4]: *** [/home/rene/LibreOffice/git/libreoffice-7-6/solenv/gbuild/CppunitTest.mk:130: /home/rene/LibreOffice/git/libreoffice-7-6/workdir/CppunitTest/sc_subsequent_filters_test.test] Error 139


I first assumed that this was due to 
https://cgit.freedesktop.org/libreoffice/core/commit/?id=cba8c933d1ff2e31ec55544f46d6fff99e8a5ccd 
but even with liborcus 0.19.1 (which includes that one) it fails.


This doesn't fail on master, so I assume I'm missing some other (maybe 
not even directly related) patch?



Any idea? Tried to get a bt out of gdb but failed - apparently it 
doesn't happen there?!


(at least got

[Inferior 1 (process 4067166) exited with code 01]
(gdb) bt
No stack.
(gdb)
)


Regards,


Rene



Re: Orcus 0.19.0, Apache Arrow and CMake

2023-10-04 Thread Miklos Vajna
Hi Kohei,

On Tue, Oct 03, 2023 at 06:05:25PM -0400, Kohei Yoshida wrote:
> Thanks for the info.  This is very helpful.  Does that mean we can consider
> cmake to be always available during the build, or is it still considered
> optional?  IIRC doxygen is an optional build component.  So, in theory one
> can still build libreoffice without cmake by disabling doxygen?
> 
> configure doesn't (seem to) check for the availability of cmake executable.

Yes, you're right, doxygen (and thus cmake) is currently optional.

I assume (but this may not be true) that nowadays most people either
know how to install doxygen/cmake/other build-time dependencies or just
use lode.git, which already installs cmake for you.

So it may not be too painful to start hard-depending on cmake, apart
from adding a configure check.
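
For illustration, such a check could look roughly like the sketch below. It is plain shell rather than configure.ac syntax, and check_cmake and the CMAKE override variable are hypothetical names, not taken from the LibreOffice build system. In configure.ac itself the same test would more likely be written with an autoconf macro such as AC_PATH_PROG.

```shell
#!/bin/sh
# Hypothetical helper for illustration; not taken from LibreOffice's
# configure.ac. CMAKE may name an alternative executable to look for.
check_cmake() {
    if cmake_path=$(command -v "${CMAKE:-cmake}" 2>/dev/null); then
        echo "checking for cmake... $cmake_path"
        return 0
    fi
    echo "checking for cmake... no" >&2
    return 1
}

if check_cmake; then
    echo "cmake found: externals that need it can be built"
else
    echo "cmake missing: a hard dependency would stop configure here" >&2
fi
```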

Regards,

Miklos


Re: Orcus 0.19.0, Apache Arrow and CMake

2023-10-03 Thread Kohei Yoshida

Hi Miklos,

On 10/3/23 02:21, Miklos Vajna wrote:
> Hi Kohei,
>
> On Mon, Oct 02, 2023 at 07:06:01PM -0400, Kohei Yoshida wrote:
> > That's good to know.  Let me think about this for a bit.  The truth is that
> > I'm not 100% sure whether I can commit to working on adding CMake executable
> > as an additional build requirement myself.  I can imagine even adding an
> > additional executable to our current buildsystem can be very complex and
> > tedious.
>
> Perhaps I miss something, but doxygen is already a build-time
> dependency, building with cmake. So lode.git has code to install cmake,
> see install_private_cmake() in bin/utils.sh in lode.git.


Thanks for the info.  This is very helpful.  Does that mean we can 
consider cmake to be always available during the build, or is it still 
considered optional?  IIRC doxygen is an optional build component.  So, 
in theory one can still build libreoffice without cmake by disabling 
doxygen?


configure doesn't (seem to) check for the availability of cmake executable.

Kohei


Re: Orcus 0.19.0, Apache Arrow and CMake

2023-10-03 Thread Miklos Vajna
Hi Kohei,

On Mon, Oct 02, 2023 at 07:06:01PM -0400, Kohei Yoshida wrote:
> That's good to know.  Let me think about this for a bit.  The truth is that
> I'm not 100% sure whether I can commit to working on adding CMake executable
> as an additional build requirement myself.  I can imagine even adding an
> additional executable to our current buildsystem can be very complex and
> tedious.

Perhaps I miss something, but doxygen is already a build-time
dependency, building with cmake. So lode.git has code to install cmake,
see install_private_cmake() in bin/utils.sh in lode.git.

So it may be easier than you think.

Regards,

Miklos


Re: Orcus 0.19.0, Apache Arrow and CMake

2023-10-02 Thread Kohei Yoshida

Hi Michael,

On 9/29/23 05:33, Michael Stahl wrote:
> hi Kohei,
>
> On 29/09/2023 03:16, Kohei Yoshida wrote:
> > Now, it's my understanding that we still don't support use of CMake
> > to build our external libraries. My question is, what do people think
> > of adding support for CMake in our build system? Would that be too
> > much effort and not worth it, or would it be worthwhile to add
> > support for it, but so far it has not been anybody's priority, or ... ?
> >
> > Please let me know what your opinions are.
>
> so as far as i remember, the reason why CMake isn't required currently
> is mainly that nobody so far wanted to add it as yet another build
> dependency that people have to install - there isn't anything
> inherently wrong with the idea and it could be called from makefiles
> just like other external build systems.


That's good to know.  Let me think about this for a bit.  The truth is 
that I'm not 100% sure whether I can commit to working on adding CMake 
executable as an additional build requirement myself.  I can imagine 
even adding an additional executable to our current buildsystem can be 
very complex and tedious.


Hmm...

Kohei



Re: Orcus 0.19.0, Apache Arrow and CMake

2023-09-29 Thread Michael Stahl

hi Kohei,

On 29/09/2023 03:16, Kohei Yoshida wrote:
> Now, it's my understanding that we still don't support use of CMake to
> build our external libraries. My question is, what do people think of
> adding support for CMake in our build system? Would that be too much
> effort and not worth it, or would it be worthwhile to add support for
> it, but so far it has not been anybody's priority, or ... ?
>
> Please let me know what your opinions are.


so as far as i remember, the reason why CMake isn't required currently 
is mainly that nobody so far wanted to add it as yet another build 
dependency that people have to install - there isn't anything inherently 
wrong with the idea and it could be called from makefiles just like 
other external build systems.


in some cases, there are specific problems with a particular external 
project's CMake files: poppler, for example, has various parameters to 
disable external library dependencies, some of which apparently do not 
work, so the CMake-generated config headers require patching.




Re: Orcus 0.19.0, Apache Arrow and CMake

2023-09-28 Thread Ilmari Lauhakangas

On 29.9.2023 4.16, Kohei Yoshida wrote:

> Hi there,
>
> I just upgraded the orcus library on the master branch to the latest
> version, 0.19.0.  This version includes experimental support for
> Apache Parquet format import, but it relies on an external library
> called Apache Arrow[1].  The support for this file format is disabled in
> the libreoffice build.
>
> To enable support for Parquet, we would need to also build the Apache
> Arrow library and its dependency libraries, all of which use CMake as
> their build systems.
>
> Now, it's my understanding that we still don't support use of CMake to
> build our external libraries. My question is, what do people think of
> adding support for CMake in our build system? Would that be too much
> effort and not worth it, or would it be worthwhile to add support for
> it, but so far it has not been anybody's priority, or ... ?
>
> Please let me know what your opinions are.
>
> As an aside, if someone wants to try out this experimental parquet
> import filter, one can build orcus independently with
> --with-parquet-filter passed to configure after also having built the
> Apache Arrow library, apply this change
>
> https://github.com/kohei-us/core/commit/ae1390947246e44a6cd3d9b8af8c46b60619a698
>
> then build libreoffice with --with-system-orcus.  Then you should be
> able to simply open a parquet file and Calc should open it.
>
> Kohei
>
> [1] https://github.com/apache/arrow


Hi,

I would be happy to see work on Meson support continue and its CMake 
module should solve what you propose.

https://mesonbuild.com/CMake-module.html#cmake-module
"It also supports the usage of CMake based subprojects"

Ilmari


Orcus 0.19.0, Apache Arrow and CMake

2023-09-28 Thread Kohei Yoshida

Hi there,

I just upgraded the orcus library on the master branch to the latest 
version, 0.19.0.  This version includes experimental support for 
Apache Parquet format import, but it relies on an external library 
called Apache Arrow[1].  The support for this file format is disabled in 
the libreoffice build.


To enable support for Parquet, we would need to also build the Apache 
Arrow library and its dependency libraries, all of which use CMake as 
their build systems.
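
To give an idea of what driving such a build involves, here is a rough sketch of a standalone Apache Arrow build with Parquet enabled, using the standard configure, build and install steps of a reasonably recent CMake. The locations in ARROW_SRC, BUILD_DIR and PREFIX and the run helper are assumptions made for illustration, and Arrow's own dependency libraries are not covered. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only. ARROW_SRC, BUILD_DIR and PREFIX are hypothetical paths.
ARROW_SRC=${ARROW_SRC:-$HOME/src/arrow}
BUILD_DIR=${BUILD_DIR:-$HOME/build/arrow}
PREFIX=${PREFIX:-$HOME/opt/arrow}

# Print each command by default; run with DRY_RUN=0 to execute for real.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Configure (Arrow's C++ sources live in the cpp/ subdirectory, and
# ARROW_PARQUET=ON enables the Parquet component), then build, then
# install into PREFIX.
build_arrow() {
    run cmake -S "$ARROW_SRC/cpp" -B "$BUILD_DIR" \
        -DCMAKE_BUILD_TYPE=Release \
        -DCMAKE_INSTALL_PREFIX="$PREFIX" \
        -DARROW_PARQUET=ON &&
    run cmake --build "$BUILD_DIR" &&
    run cmake --install "$BUILD_DIR"
}

build_arrow
```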


Now, it's my understanding that we still don't support use of CMake to 
build our external libraries. My question is, what do people think of 
adding support for CMake in our build system? Would that be too much 
effort and not worth it, or would it be worthwhile to add support for 
it, but so far it has not been anybody's priority, or ... ?


Please let me know what your opinions are.

As an aside, if someone wants to try out this experimental parquet 
import filter, one can build orcus independently with 
--with-parquet-filter passed to configure after also having built the 
Apache Arrow library, apply this change


https://github.com/kohei-us/core/commit/ae1390947246e44a6cd3d9b8af8c46b60619a698

then build libreoffice with --with-system-orcus.  Then you should be 
able to simply open a parquet file and Calc should open it.


Kohei

[1] https://github.com/apache/arrow