Re: Alluxio cache read support
Hi Sutou Kouhei/Team,

*[Background]*
I am working on the Intel gazelle_plugin <https://github.com/oap-project/gazelle_plugin>, a C++-based backend with an Arrow compute engine for Spark. During scans, i.e. when reading data from HDFS/cloud storage, we currently use the cloud/HDFS APIs you mentioned. We now have Alluxio Cache <https://docs.alluxio.io/ee/user/stable/en/core-services/Caching.html> in between for fast data access.

*[Problem]*
HDFS/Cloud ---> Alluxio ---> arrow FS api ---> arrow parquet scan

*[Need help]*
With the connection below:
[ Alluxio -> arrow FS api ]

On Tue, 6 Sept 2022 at 02:17, Sutou Kouhei wrote:

> Hi,
>
> Could you try our HDFS support?
>
> * https://arrow.apache.org/docs/cpp/dataset.html#reading-from-cloud-storage
> * https://arrow.apache.org/docs/cpp/api/filesystem.html#_CPPv4N5arrow2fs16HadoopFileSystemE
>
> (You're using Apache Arrow C++, right?)
>
>
> Thanks,
> --
> kou
>
> In
>   "Alluxio cache read support" on Mon, 5 Sep 2022 16:58:57 +0530,
>   Manoj Kumar wrote:
>
> > Hi Team,
> >
> > Anyone know how to access HDFS/cloud FS backend by Alluxio via the arrow
> > filesystem ?
Alluxio cache read support
Hi Team,

Does anyone know how to access HDFS/cloud storage through Alluxio via the Arrow filesystem API?
Re: HDFS ORC to Arrow Dataset
On further analysis:

==114164== Process terminating with default action of signal 6 (SIGABRT)
==114164==    at 0x4AD118B: raise (raise.c:51)
==114164==    by 0x4AB092D: abort (abort.c:100)
==114164==    by 0x598D768: os::abort(bool) (in /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so)
==114164==    by 0x5B52802: VMError::report_and_die() (in /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so)
==114164==    by 0x59979F4: JVM_handle_linux_signal (in /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so)
==114164==    by 0x598A8B7: signalHandler(int, siginfo*, void*) (in /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so)
==114164==    by 0x485F3BF: ??? (in /usr/lib/x86_64-linux-gnu/libpthread-2.31.so)
==114164==    by 0x5949C26: Monitor::ILock(Thread*) [clone .part.2] (in /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so)
==114164==    by 0x594B50A: Monitor::lock_without_safepoint_check() (in /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so)
==114164==    by 0x5B59660: VM_Exit::wait_if_vm_exited() (in /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so)
==114164==    by 0x574137C: jni_DetachCurrentThread (in /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so)
==114164==    by 0x4140AA4E: hdfsThreadDestructor (thread_local_storage.c:53)

It turned out that the issue was in libhdfs, so I fixed that. ORC JNI now works fine as well.

Many features are still missing in ORC-JNI, such as reading a full split or index-based reading. Is there any plan to support those?
On Wed, 8 Sept 2021 at 22:06, Manoj Kumar wrote:

> Hi Wes,
>
> Thanks,
>
> *[ Part 1 ]*
> *C++ HDFS/ORC [Completed]*
> Steps which I followed :
> 1) arrow::fs::HadoopFileSystem --> create a hadoop FS
> 2) std::shared_ptr --> then create a stream
> 3) Pass that stream to adapters::orc::ORCFileReader
>
> *[Part 2 ]*
> *C++ HDFS/ORC via Java JNI [Partial Completed]*
> *Follow same approach in orc.jni_wrapper*
> 1) arrow::fs::HadoopFileSystem --> create a hadoop FS
> 2) std::shared_ptr --> then create a stream
> 3) Pass that stream to adapters::orc::ORCFileReader
>
>     std::unique_ptr<ORCFileReader> reader;
>     arrow::Status ret;
>     if (path.find("hdfs://") == 0) {
>       arrow::fs::HdfsOptions options_;
>       options_ = *arrow::fs::HdfsOptions::FromUri(path);
>       auto _fsRes = arrow::fs::HadoopFileSystem::Make(options_);
>       if (!_fsRes.ok()) {
>         std::cerr << "HadoopFileSystem::Make failed, it is possible when we don't have "
>                      "proper driver on this node, err msg is "
>                   << _fsRes.status().ToString();
>       }
>       _fs = *_fsRes;
>       auto _stream = *_fs->OpenInputFile(path);
>       hadoop_fs_holder_.Insert(_fs);  // global holder in arrow::jni::ConcurrentMap, cleared during unload
>       ret = ORCFileReader::Open(_stream, arrow::default_memory_pool(), &reader);
>       if (!ret.ok()) {
>         env->ThrowNew(io_exception_class, std::string("Failed open file" + path).c_str());
>       }
>       return orc_reader_holder_.Insert(std::shared_ptr<ORCFileReader>(reader.release()));
>     }
>
> JNI also works fine, but at the end of application, I am getting
> segmentation fault.
>
> *Do you have any idea about , looks like some issue with libhdfs
> connection close or cleanup ?*
>
> *stack trace:*
> /tmp/tmp3973555041947319188libarrow_orc_jni.so : ()+0xb8b1a3
> /lib/x86_64-linux-gnu/libpthread.so.0 : ()+0x153c0
> /lib/x86_64-linux-gnu/libc.so.6 : gsignal()+0xcb
> /lib/x86_64-linux-gnu/libc.so.6 : abort()+0x12b
> /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so : ()+0x90e769
> /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so : ()+0xad3803
> /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so : JVM_handle_linux_signal()+0x1a5
> /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so : ()+0x90b8b8
> /lib/x86_64-linux-gnu/libpthread.so.0 : ()+0x153c0
> /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so : ()+0x8cac27
> /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so : ()+0x8cc50b
> /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so
Re: HDFS ORC to Arrow Dataset
Hi Wes,

Thanks,

*[ Part 1 ]*
*C++ HDFS/ORC [Completed]*
Steps which I followed:
1) arrow::fs::HadoopFileSystem --> create a Hadoop FS
2) std::shared_ptr --> then create a stream
3) Pass that stream to adapters::orc::ORCFileReader

*[ Part 2 ]*
*C++ HDFS/ORC via Java JNI [Partially Completed]*
*Follow the same approach in orc.jni_wrapper*
1) arrow::fs::HadoopFileSystem --> create a Hadoop FS
2) std::shared_ptr --> then create a stream
3) Pass that stream to adapters::orc::ORCFileReader

    std::unique_ptr<ORCFileReader> reader;
    arrow::Status ret;
    if (path.find("hdfs://") == 0) {
      arrow::fs::HdfsOptions options_;
      options_ = *arrow::fs::HdfsOptions::FromUri(path);
      auto _fsRes = arrow::fs::HadoopFileSystem::Make(options_);
      if (!_fsRes.ok()) {
        std::cerr << "HadoopFileSystem::Make failed, it is possible when we don't have "
                     "proper driver on this node, err msg is "
                  << _fsRes.status().ToString();
      }
      _fs = *_fsRes;
      auto _stream = *_fs->OpenInputFile(path);
      hadoop_fs_holder_.Insert(_fs);  // global holder in arrow::jni::ConcurrentMap, cleared during unload
      ret = ORCFileReader::Open(_stream, arrow::default_memory_pool(), &reader);
      if (!ret.ok()) {
        env->ThrowNew(io_exception_class, std::string("Failed open file" + path).c_str());
      }
      return orc_reader_holder_.Insert(std::shared_ptr<ORCFileReader>(reader.release()));
    }

JNI also works fine, but at the end of the application I get a segmentation fault.

*Do you have any idea about this? It looks like an issue with libhdfs connection close or cleanup.*

*stack trace:*
/tmp/tmp3973555041947319188libarrow_orc_jni.so : ()+0xb8b1a3
/lib/x86_64-linux-gnu/libpthread.so.0 : ()+0x153c0
/lib/x86_64-linux-gnu/libc.so.6 : gsignal()+0xcb
/lib/x86_64-linux-gnu/libc.so.6 : abort()+0x12b
/home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so : ()+0x90e769
/home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so : ()+0xad3803
/home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so : JVM_handle_linux_signal()+0x1a5
/home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so : ()+0x90b8b8
/lib/x86_64-linux-gnu/libpthread.so.0 : ()+0x153c0
/home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so : ()+0x8cac27
/home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so : ()+0x8cc50b
/home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so : ()+0xada661
/home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so : ()+0x6c237d
*/home/legion/ha_devel/hadoop-ecosystem-3x/hadoop-3.1.1/lib/native/libhdfs.so : ()+0xaa4f*
/lib/x86_64-linux-gnu/libpthread.so.0 : ()+0x85a1
/lib/x86_64-linux-gnu/libpthread.so.0 : ()+0x962a
/lib/x86_64-linux-gnu/libc.so.6 : clone()+0x43

On Wed, 8 Sept 2021 at 04:07, Weston Pace wrote:

> I'll just add that a PR is in progress (thanks Joris!) for adding this
> adapter: https://github.com/apache/arrow/pull/10991
>
> On Tue, Sep 7, 2021 at 12:05 PM Wes McKinney wrote:
> >
> > I'm missing context but if you're talking about C++/Python, we are
> > currently missing a wrapper interface to the ORC reader in the Arrow
> > datasets library
> >
> > https://github.com/apache/arrow/tree/master/cpp/src/arrow/dataset
> >
> > We have CSV, Arrow (IPC), and Parquet interfaces.
> >
> > But we have an HDFS filesystem implementation and an ORC reader
> > implementation, so mechanically all of the pieces are there but need
> > to be connected together.
> >
> > Thanks,
> > Wes
> >
> > On Tue, Sep 7, 2021 at 8:22 AM Manoj Kumar wrote:
> > >
> > > Hi Dev-Community,
> > >
> > > Anyone can help me to guide how to read ORC directly from HDFS to an
> > > arrow dataset.
> > >
> > > Thanks
> > > Manoj
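A hedged aside on the snippet in the message above: it logs when `HadoopFileSystem::Make` fails but then still dereferences the result, and it tests the scheme with `path.find("hdfs://") == 0`, which keeps searching past position 0 when the prefix is absent. The sketch below shows the same control flow with an early return and an anchored prefix check. It deliberately uses no Arrow headers: `HasScheme`, `ConnectStub`, `OpenOrcSketch`, and `OpenStatus` are hypothetical stand-ins invented for illustration, not Arrow APIs.

```cpp
#include <iostream>
#include <string>

// Outcome of the hypothetical open routine below.
enum class OpenStatus { kOk, kNotHdfs, kConnectFailed };

// True when `uri` begins with `scheme`. Only the first scheme.size()
// characters are compared, so the rest of the string is never scanned.
bool HasScheme(const std::string& uri, const std::string& scheme) {
  return uri.compare(0, scheme.size(), scheme) == 0;
}

// Hypothetical stand-in for the filesystem connection step; it fails
// when `driver_available` is false.
bool ConnectStub(const std::string& uri, bool driver_available,
                 std::string* connection) {
  if (!driver_available) return false;
  *connection = "connected:" + uri;
  return true;
}

// Mirrors the control flow of the posted snippet, but stops on failure.
OpenStatus OpenOrcSketch(const std::string& path, bool driver_available) {
  if (!HasScheme(path, "hdfs://")) return OpenStatus::kNotHdfs;
  std::string connection;
  if (!ConnectStub(path, driver_available, &connection)) {
    std::cerr << "connect failed for " << path << "\n";
    return OpenStatus::kConnectFailed;  // return here; nothing valid to use
  }
  // A real implementation would open the stream and the ORC reader here.
  return OpenStatus::kOk;
}
```

In the JNI wrapper, the equivalent of the `kConnectFailed` branch would presumably throw the Java exception and return before touching the filesystem handle.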
Fwd: HDFS ORC to Arrow Dataset
Hi Dev-Community,

Can anyone help guide me on how to read ORC directly from HDFS into an Arrow dataset?

Thanks
Manoj
Re: Installing Arrow
I strung together a set of installation instructions to help future Python developers. The PR is here: https://github.com/apache/arrow/pull/105

On Wed, Jul 6, 2016 at 2:28 PM, Wes McKinney wrote:

> You can look at the Travis CI scripts to see the build procedure for
> each component:
>
> https://github.com/apache/arrow/tree/master/ci
>
> While developing, I usually install things like parquet-cpp in
> $HOME/local and set $PARQUET_HOME (or any other dependency env
> variables) to that directory.
>
> We haven't put much effort into dealing with the optional dependencies
> of pyarrow, or even deciding what is optional vs required. So for now
> installing parquet-cpp is needed for building the Python extensions.
>
> - Wes
>
> On Wed, Jul 6, 2016 at 11:05 AM, Manoj Kumar wrote:
> > Hi,
> >
> > I have problems in installing the Python port of arrow. It seems that it
> > requires parquet-cpp.
> >
> > So I installed parquet-cpp from source using the following instructions
> > here (https://github.com/apache/parquet-cpp)
> >
> > Doing "cmake ." and "make" from the parquet-cpp root directory, seems to
> > work hinting that parquet-cpp seems to be installed successfully.
> >
> > However, doing "sudo python3 setup.py install" from the python root
> > directory fails with the error,
> >
> > "Could not find the Parquet library. Looked in system search paths."
> >
> > What do I need to set "PARQUET_HOME" to be to install the python port?
> >
> > Thanks!
> >
> > On Tue, Jul 5, 2016 at 3:47 PM, Manoj Kumar <manojkumarsivaraj...@gmail.com>
> > wrote:
> >
> >> Hi Wes,
> >>
> >> I updated Boost and it works now.
> >>
> >> I'll send a pull request to make a note for that soon.
> >>
> >> On Tue, Jul 5, 2016 at 3:30 PM, Wes McKinney wrote:
> >>
> >>> Oops, had a keyboarding failure. I got cmake 2.8.12.2 via yum on
> >>> CentOS 6 after installing the devtoolset.
> >>>
> >>> On Tue, Jul 5, 2016 at 3:29 PM, Wes McKinney wrote:
> >>> > hi Manoj,
> >>> >
> >>> > What is the output of
> >>> >
> >>> > cmake --version
> >>> >
> >>> > I installed the RHEL devtoolset using the set of commands in
> >>> >
> >>> > https://github.com/conda-forge/docker-images/blob/master/linux-anvil/Dockerfile
> >>> >
> >>> > I had to run
> >>> >
> >>> > scl enable devtoolset-2 bash
> >>> > x
> >>> > and am able to build the thirdparty on CentOS 6.8.
> >>> >
> >>> > https://github.com/conda-forge/docker-images/blob/master/linux-anvil/Dockerfile
> >>> >
> >>> > I hit another bug due to old boost-devel in CentOS, so I'd need to
> >>> > look more into that
> >>> >
> >>> > - Wes
> >>> >
> >>> > On Tue, Jul 5, 2016 at 10:40 AM, Manoj Kumar wrote:
> >>> >> Hi all,
> >>> >>
> >>> >> Thanks for the tips. I upgraded gcc to support C++11.
> >>> >>
> >>> >> g++ --version
> >>> >> g++ (GCC) 4.8.2 20140120 (Red Hat 4.8.2-15)
> >>> >> Copyright (C) 2013 Free Software Foundation, Inc.
> >>> >>
> >>> >> But I still get the error. Here is the full traceback:
> >>> >>
> >>> >> https://gist.github.com/MechCoder/2f97e53c35d36bb132d118f9abc7a255
> >>> >>
> >>> >> Any help would be appreciated.
> >>> >>
> >>> >> On Fri, Jul 1, 2016 at 4:59 PM, Micah Kornfield <emkornfi...@gmail.com>
> >>> >> wrote:
> >>> >>
> >>> >>> I think if c++11 was used to compile the library it might also fix the
> >>> >>> issue.
> >>> >>>
> >>> >>> On Friday, July 1, 2016, Holden Karau wrote:
> >>> >>>
> >>> >>> > So doing a bit of hunting it seems like this might be coming since you're
> >>> >>> > missing some expected build libraries, namely you don't have any
[jira] [Created] (ARROW-240) Installation instructions for pyarrow
Manoj Kumar created ARROW-240:

    Summary: Installation instructions for pyarrow
    Key: ARROW-240
    URL: https://issues.apache.org/jira/browse/ARROW-240
    Project: Apache Arrow
    Issue Type: Improvement
    Components: Python
    Reporter: Manoj Kumar

It would be great to have updated installation instructions for building the pyarrow project.
Re: Installing Arrow
Hi,

I am having problems installing the Python port of Arrow. It seems that it requires parquet-cpp.

So I installed parquet-cpp from source using the instructions here (https://github.com/apache/parquet-cpp).

Running "cmake ." and "make" from the parquet-cpp root directory seems to work, suggesting that parquet-cpp was installed successfully.

However, running "sudo python3 setup.py install" from the python root directory fails with the error:

"Could not find the Parquet library. Looked in system search paths."

What do I need to set "PARQUET_HOME" to in order to install the Python port?

Thanks!

On Tue, Jul 5, 2016 at 3:47 PM, Manoj Kumar wrote:

> Hi Wes,
>
> I updated Boost and it works now.
>
> I'll send a pull request to make a note for that soon.
>
> On Tue, Jul 5, 2016 at 3:30 PM, Wes McKinney wrote:
>
>> Oops, had a keyboarding failure. I got cmake 2.8.12.2 via yum on
>> CentOS 6 after installing the devtoolset.
>>
>> On Tue, Jul 5, 2016 at 3:29 PM, Wes McKinney wrote:
>> > hi Manoj,
>> >
>> > What is the output of
>> >
>> > cmake --version
>> >
>> > I installed the RHEL devtoolset using the set of commands in
>> >
>> > https://github.com/conda-forge/docker-images/blob/master/linux-anvil/Dockerfile
>> >
>> > I had to run
>> >
>> > scl enable devtoolset-2 bash
>> > x
>> > and am able to build the thirdparty on CentOS 6.8.
>> >
>> > https://github.com/conda-forge/docker-images/blob/master/linux-anvil/Dockerfile
>> >
>> > I hit another bug due to old boost-devel in CentOS, so I'd need to
>> > look more into that
>> >
>> > - Wes
>> >
>> > On Tue, Jul 5, 2016 at 10:40 AM, Manoj Kumar wrote:
>> >> Hi all,
>> >>
>> >> Thanks for the tips. I upgraded gcc to support C++11.
>> >>
>> >> g++ --version
>> >> g++ (GCC) 4.8.2 20140120 (Red Hat 4.8.2-15)
>> >> Copyright (C) 2013 Free Software Foundation, Inc.
>> >>
>> >> But I still get the error. Here is the full traceback:
>> >>
>> >> https://gist.github.com/MechCoder/2f97e53c35d36bb132d118f9abc7a255
>> >>
>> >> Any help would be appreciated.
>> >>
>> >> On Fri, Jul 1, 2016 at 4:59 PM, Micah Kornfield wrote:
>> >>
>> >>> I think if c++11 was used to compile the library it might also fix the
>> >>> issue.
>> >>>
>> >>> On Friday, July 1, 2016, Holden Karau wrote:
>> >>>
>> >>> > So doing a bit of hunting it seems like this might be coming since you're
>> >>> > missing some expected build libraries, namely you don't have any of the
>> >>> > expected regex libraries. I'm used to working on ubuntu/debian derived
>> >>> > systems but it seems like installing something like `yum groupinstall
>> >>> > "Development Tools"` might help install this and other related libraries
>> >>> > you will probably find yourself needing.
>> >>> > It might be useful to make a note of this in the README as well on what the
>> >>> > expected/required native libraries are.
>> >>> >
>> >>> > On Fri, Jul 1, 2016 at 4:31 PM, Manoj Kumar <manojkumarsivaraj...@gmail.com>
>> >>> > wrote:
>> >>> >
>> >>> > > Hi,
>> >>> > >
>> >>> > > I am trying to install Arrow using the following instructions.
>> >>> > >
>> >>> > > ./cpp/thirdparty/download_thirdparty.sh
>> >>> > > ./cpp/thirdparty/build_thirdparty.sh
>> >>> > >
>> >>> > > It fails with this error:
>> >>> > >
>> >>> > > + cmake -DCMAKE_BUILD_TYPE=Release
>> >>> > > -DCMAKE_INSTALL_PREFIX=/home/manoj/arrow/cpp/thirdparty/installed
>> >>> > > '-DCMAKE_CXX_FLAGS=-fPIC --std=c++0x' .
>> >>> > > -- git Version: v0.0.0-dirty
>> >>> > > -- Version: 0.0.0
>> >>> > > -- Performing Test HAVE_STD_REGEX
>> >>> > > -- Performing Test HAVE_STD_REGEX -- failed to compile
>> >>> > > -- Performing Test HAVE_GNU_POSIX_REGEX
>> >>> > > -- Performing Test HAVE_GNU_POSIX_REGEX -- failed to compile
>> >>> > > -- Performing Test HAVE_POSIX_REGEX
>> >>> > > -- Performing Test HAVE_POSIX_REGEX -- failed to compile
>> >>> > > -- Performing Test HAVE_STEADY_CLOCK
>> >>> > > -- Performing Test HAVE_STEADY_CLOCK -- failed to compile
>> >>> > > CMake Error at src/CMakeLists.txt:17 (message):
>> >>> > > Failed to determine the source files for the regular expression
>> >>> > > backend.
>> >>> > >
>> >>> > > I use CentOS version: 6.7
>> >>> > >
>> >>> > > Any help to fix this, would be appreciated. Thanks!
>> >>> > >
>> >>> > > --
>> >>> > > Manoj,
>> >>> > > http://github.com/MechCoder
>> >>> >
>> >>> > --
>> >>> > Cell : 425-233-8271
>> >>> > Twitter: https://twitter.com/holdenkarau
>> >>
>> >> --
>> >> Manoj,
>> >> http://github.com/MechCoder
>
> --
> Manoj,
> http://github.com/MechCoder

--
Manoj,
http://github.com/MechCoder
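The error text in the message above says the build looked only in system search paths, which is what one would expect when `PARQUET_HOME` is unset. The sketch below models that kind of lookup: given an install prefix it returns `<prefix>/include` and `<prefix>/lib`, and otherwise an empty list standing for "fall back to system paths". It is a hypothetical illustration of the logic, not the project's actual CMake module; `ParquetSearchDirs` is an invented name, and the `include`/`lib` layout is an assumption about a conventional install prefix.

```cpp
#include <string>
#include <vector>

// Directories a build might search for Parquet headers and libraries.
// An empty result means "no prefix given, use system search paths".
std::vector<std::string> ParquetSearchDirs(const char* parquet_home) {
  std::vector<std::string> dirs;
  if (parquet_home == nullptr || *parquet_home == '\0') return dirs;
  std::string prefix(parquet_home);
  // Drop trailing slashes (a lone "/" is left alone).
  while (prefix.size() > 1 && prefix.back() == '/') prefix.pop_back();
  dirs.push_back(prefix + "/include");
  dirs.push_back(prefix + "/lib");
  return dirs;
}
```

A caller could pass `std::getenv("PARQUET_HOME")` from `<cstdlib>`. Note that the message describes running only "cmake ." and "make"; without an install step into some prefix, there may be nothing under that prefix for the lookup to find, so `PARQUET_HOME` would need to point at wherever the headers and library actually ended up.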
Re: Installing Arrow
Hi Wes,

I updated Boost and it works now.

I'll send a pull request to make a note for that soon.

On Tue, Jul 5, 2016 at 3:30 PM, Wes McKinney wrote:

> Oops, had a keyboarding failure. I got cmake 2.8.12.2 via yum on
> CentOS 6 after installing the devtoolset.
>
> On Tue, Jul 5, 2016 at 3:29 PM, Wes McKinney wrote:
> > hi Manoj,
> >
> > What is the output of
> >
> > cmake --version
> >
> > I installed the RHEL devtoolset using the set of commands in
> >
> > https://github.com/conda-forge/docker-images/blob/master/linux-anvil/Dockerfile
> >
> > I had to run
> >
> > scl enable devtoolset-2 bash
> > x
> > and am able to build the thirdparty on CentOS 6.8.
> >
> > https://github.com/conda-forge/docker-images/blob/master/linux-anvil/Dockerfile
> >
> > I hit another bug due to old boost-devel in CentOS, so I'd need to
> > look more into that
> >
> > - Wes
> >
> > On Tue, Jul 5, 2016 at 10:40 AM, Manoj Kumar wrote:
> >> Hi all,
> >>
> >> Thanks for the tips. I upgraded gcc to support C++11.
> >>
> >> g++ --version
> >> g++ (GCC) 4.8.2 20140120 (Red Hat 4.8.2-15)
> >> Copyright (C) 2013 Free Software Foundation, Inc.
> >>
> >> But I still get the error. Here is the full traceback:
> >>
> >> https://gist.github.com/MechCoder/2f97e53c35d36bb132d118f9abc7a255
> >>
> >> Any help would be appreciated.
> >>
> >> On Fri, Jul 1, 2016 at 4:59 PM, Micah Kornfield wrote:
> >>
> >>> I think if c++11 was used to compile the library it might also fix the
> >>> issue.
> >>>
> >>> On Friday, July 1, 2016, Holden Karau wrote:
> >>>
> >>> > So doing a bit of hunting it seems like this might be coming since your
> >>> > missing some expected build libraries, namely you don't have any of the
> >>> > expected regex libraries. I'm used to working on ubuntu/debian derived
> >>> > systems but it seems like installing something like `yum groupinstall
> >>> > "Development Tools"` might help install this and other related libraries
> >>> > you will probably find yourself needing.
> >>> > It might be useful to make a note of this in the README as well on what the
> >>> > expected/required native libraries are.
> >>> >
> >>> > On Fri, Jul 1, 2016 at 4:31 PM, Manoj Kumar <manojkumarsivaraj...@gmail.com>
> >>> > wrote:
> >>> >
> >>> > > Hi,
> >>> > >
> >>> > > I am trying to install Arrow using the following instructions.
> >>> > >
> >>> > > ./cpp/thirdparty/download_thirdparty.sh
> >>> > > ./cpp/thirdparty/build_thirdparty.sh
> >>> > >
> >>> > > It fails with this error:
> >>> > >
> >>> > > + cmake -DCMAKE_BUILD_TYPE=Release
> >>> > > -DCMAKE_INSTALL_PREFIX=/home/manoj/arrow/cpp/thirdparty/installed
> >>> > > '-DCMAKE_CXX_FLAGS=-fPIC --std=c++0x' .
> >>> > > -- git Version: v0.0.0-dirty
> >>> > > -- Version: 0.0.0
> >>> > > -- Performing Test HAVE_STD_REGEX
> >>> > > -- Performing Test HAVE_STD_REGEX -- failed to compile
> >>> > > -- Performing Test HAVE_GNU_POSIX_REGEX
> >>> > > -- Performing Test HAVE_GNU_POSIX_REGEX -- failed to compile
> >>> > > -- Performing Test HAVE_POSIX_REGEX
> >>> > > -- Performing Test HAVE_POSIX_REGEX -- failed to compile
> >>> > > -- Performing Test HAVE_STEADY_CLOCK
> >>> > > -- Performing Test HAVE_STEADY_CLOCK -- failed to compile
> >>> > > CMake Error at src/CMakeLists.txt:17 (message):
> >>> > > Failed to determine the source files for the regular expression
> >>> > > backend.
> >>> > >
> >>> > > I use CentOS version: 6.7
> >>> > >
> >>> > > Any help to fix this, would be appreciated. Thanks!
> >>> > >
> >>> > > --
> >>> > > Manoj,
> >>> > > http://github.com/MechCoder
> >>> >
> >>> > --
> >>> > Cell : 425-233-8271
> >>> > Twitter: https://twitter.com/holdenkarau
> >>
> >> --
> >> Manoj,
> >> http://github.com/MechCoder

--
Manoj,
http://github.com/MechCoder
Re: Installing Arrow
Hi all,

Thanks for the tips. I upgraded gcc to support C++11.

g++ --version
g++ (GCC) 4.8.2 20140120 (Red Hat 4.8.2-15)
Copyright (C) 2013 Free Software Foundation, Inc.

But I still get the error. Here is the full traceback:

https://gist.github.com/MechCoder/2f97e53c35d36bb132d118f9abc7a255

Any help would be appreciated.

On Fri, Jul 1, 2016 at 4:59 PM, Micah Kornfield wrote:

> I think if c++11 was used to compile the library it might also fix the
> issue.
>
> On Friday, July 1, 2016, Holden Karau wrote:
>
> > So doing a bit of hunting it seems like this might be coming since your
> > missing some expected build libraries, namely you don't have any of the
> > expected regex libraries. I'm used to working on ubuntu/debian derived
> > systems but it seems like installing something like `yum groupinstall
> > "Development Tools"` might help install this and other related libraries
> > you will probably find yourself needing.
> > It might be useful to make a note of this in the README as well on what the
> > expected/required native libraries are.
> >
> > On Fri, Jul 1, 2016 at 4:31 PM, Manoj Kumar <manojkumarsivaraj...@gmail.com>
> > wrote:
> >
> > > Hi,
> > >
> > > I am trying to install Arrow using the following instructions.
> > >
> > > ./cpp/thirdparty/download_thirdparty.sh
> > > ./cpp/thirdparty/build_thirdparty.sh
> > >
> > > It fails with this error:
> > >
> > > + cmake -DCMAKE_BUILD_TYPE=Release
> > > -DCMAKE_INSTALL_PREFIX=/home/manoj/arrow/cpp/thirdparty/installed
> > > '-DCMAKE_CXX_FLAGS=-fPIC --std=c++0x' .
> > > -- git Version: v0.0.0-dirty
> > > -- Version: 0.0.0
> > > -- Performing Test HAVE_STD_REGEX
> > > -- Performing Test HAVE_STD_REGEX -- failed to compile
> > > -- Performing Test HAVE_GNU_POSIX_REGEX
> > > -- Performing Test HAVE_GNU_POSIX_REGEX -- failed to compile
> > > -- Performing Test HAVE_POSIX_REGEX
> > > -- Performing Test HAVE_POSIX_REGEX -- failed to compile
> > > -- Performing Test HAVE_STEADY_CLOCK
> > > -- Performing Test HAVE_STEADY_CLOCK -- failed to compile
> > > CMake Error at src/CMakeLists.txt:17 (message):
> > > Failed to determine the source files for the regular expression
> > > backend.
> > >
> > > I use CentOS version: 6.7
> > >
> > > Any help to fix this, would be appreciated. Thanks!
> > >
> > > --
> > > Manoj,
> > > http://github.com/MechCoder
> >
> > --
> > Cell : 425-233-8271
> > Twitter: https://twitter.com/holdenkarau

--
Manoj,
http://github.com/MechCoder
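To confirm which compiler and language standard a build is really using, a small probe can help. The sketch below reports the value of the standard `__cplusplus` macro and, on GCC-compatible compilers, the `__GNUC__` version macros; the function names are made up for illustration. One caveat worth knowing: GCC 4.8 ships a `<regex>` header, but libstdc++'s regex support was only completed in GCC 4.9, so a configure check that exercises `std::regex` may still fail with g++ 4.8.2.

```cpp
#include <sstream>
#include <string>

// Value of __cplusplus for this translation unit (201103L means C++11).
long CxxStandard() { return __cplusplus; }

// Compiler version as "major.minor.patch" on GCC-compatible compilers,
// or "unknown" elsewhere.
std::string CompilerVersion() {
#if defined(__GNUC__)
  std::ostringstream out;
  out << __GNUC__ << "." << __GNUC_MINOR__ << "." << __GNUC_PATCHLEVEL__;
  return out.str();
#else
  return "unknown";
#endif
}

// True when the compiler reports at least C++11.
bool HasCxx11() { return CxxStandard() >= 201103L; }
```

Building this with the same compiler and flags as the failing build shows whether the upgraded toolchain is actually the one being picked up.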
Installing Arrow
Hi,

I am trying to install Arrow using the following instructions.

./cpp/thirdparty/download_thirdparty.sh
./cpp/thirdparty/build_thirdparty.sh

It fails with this error:

+ cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/home/manoj/arrow/cpp/thirdparty/installed '-DCMAKE_CXX_FLAGS=-fPIC --std=c++0x' .
-- git Version: v0.0.0-dirty
-- Version: 0.0.0
-- Performing Test HAVE_STD_REGEX
-- Performing Test HAVE_STD_REGEX -- failed to compile
-- Performing Test HAVE_GNU_POSIX_REGEX
-- Performing Test HAVE_GNU_POSIX_REGEX -- failed to compile
-- Performing Test HAVE_POSIX_REGEX
-- Performing Test HAVE_POSIX_REGEX -- failed to compile
-- Performing Test HAVE_STEADY_CLOCK
-- Performing Test HAVE_STEADY_CLOCK -- failed to compile
CMake Error at src/CMakeLists.txt:17 (message):
  Failed to determine the source files for the regular expression backend.

I am on CentOS 6.7.

Any help fixing this would be appreciated. Thanks!

--
Manoj,
http://github.com/MechCoder
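The failing `HAVE_STD_REGEX` and `HAVE_STEADY_CLOCK` lines in the log above are configure-time feature tests, where CMake tries to build a tiny program that uses the feature. The exact test sources are not shown in this thread, so the following is only an approximation of what such probes exercise; `StdRegexWorks` and `SteadyClockWorks` are illustrative names. If this does not compile and run with the same compiler and flags, the toolchain likely lacks usable C++11 library support.

```cpp
#include <chrono>
#include <regex>
#include <string>

// Exercises std::regex: compiles a pattern and searches a short string.
bool StdRegexWorks() {
  const std::string text = "test0159";
  const std::regex pattern("^[a-z]+[0-9]+$");
  return std::regex_search(text, pattern);
}

// Exercises std::chrono::steady_clock: two readings must not go backwards.
bool SteadyClockWorks() {
  const auto start = std::chrono::steady_clock::now();
  const auto end = std::chrono::steady_clock::now();
  return end >= start;
}
```

That all four checks fail, including the POSIX regex ones, fits the suggestion made in the replies that basic build tooling or a C++11-capable compiler was missing on the machine.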