Re: Alluxio cache read support

2022-09-06 Thread Manoj Kumar
Hi Sutou Kouhei/Team,

*[Background]*

I am working on Intel's gazelle_plugin
<https://github.com/oap-project/gazelle_plugin>,
a C++-based backend for Spark that uses the Arrow compute engine.
During a scan, i.e. when reading data from HDFS or cloud storage, we
currently use the cloud/HDFS APIs mentioned in this thread.
We now have Alluxio Cache
<https://docs.alluxio.io/ee/user/stable/en/core-services/Caching.html>
in between for faster data access.

*[Problem]*

HDFS/Cloud -> Alluxio -> Arrow FS API -> Arrow Parquet scan

*[Need help]*

I need help with the connection below:
[ Alluxio -> Arrow FS API ]


On Tue, 6 Sept 2022 at 02:17, Sutou Kouhei  wrote:

> Hi,
>
> Could you try our HDFS support?
>
> *
> https://arrow.apache.org/docs/cpp/dataset.html#reading-from-cloud-storage
> *
> https://arrow.apache.org/docs/cpp/api/filesystem.html#_CPPv4N5arrow2fs16HadoopFileSystemE
>
> (You're using Apache Arrow C++, right?)
>
>
> Thanks,
> --
> kou
>
> In 
>   "Alluxio cache read support" on Mon, 5 Sep 2022 16:58:57 +0530,
>   Manoj Kumar  wrote:
>
> > Hi Team,
> >
> > Does anyone know how to access an HDFS/cloud FS backed by Alluxio via
> > the Arrow filesystem?
>


Alluxio cache read support

2022-09-05 Thread Manoj Kumar
Hi Team,

Does anyone know how to access an HDFS/cloud FS backed by Alluxio via
the Arrow filesystem?


Re: HDFS ORC to Arrow Dataset

2021-09-09 Thread Manoj Kumar
On further analysis:

==114164== Process terminating with default action of signal 6 (SIGABRT)
==114164==    at 0x4AD118B: raise (raise.c:51)
==114164==    by 0x4AB092D: abort (abort.c:100)
==114164==    by 0x598D768: os::abort(bool) (in /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so)
==114164==    by 0x5B52802: VMError::report_and_die() (in /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so)
==114164==    by 0x59979F4: JVM_handle_linux_signal (in /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so)
==114164==    by 0x598A8B7: signalHandler(int, siginfo*, void*) (in /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so)
==114164==    by 0x485F3BF: ??? (in /usr/lib/x86_64-linux-gnu/libpthread-2.31.so)
==114164==    by 0x5949C26: Monitor::ILock(Thread*) [clone .part.2] (in /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so)
==114164==    by 0x594B50A: Monitor::lock_without_safepoint_check() (in /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so)
==114164==    by 0x5B59660: VM_Exit::wait_if_vm_exited() (in /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so)
==114164==    by 0x574137C: jni_DetachCurrentThread (in /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so)
==114164==    by 0x4140AA4E: hdfsThreadDestructor (thread_local_storage.c:53

It turned out that the issue was in libhdfs, so I fixed that.
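
For context, the innermost application frame in the trace, hdfsThreadDestructor, appears to be a thread-exit callback, and the frames above it (jni_DetachCurrentThread, VM_Exit::wait_if_vm_exited) suggest it ran while the JVM was already shutting down. The sketch below only illustrates how such a callback is typically registered with a POSIX thread-specific key. It is not libhdfs code, and every name in it is hypothetical.

```cpp
#include <pthread.h>

#include <atomic>
#include <cassert>

// All names below are hypothetical; this is not libhdfs code.
static pthread_key_t g_tls_key;
static std::atomic<int> g_destructor_calls{0};

// Registered with the key. The C library calls it on the exiting thread
// for every thread that stored a non-null value under g_tls_key.
static void OnThreadExit(void* value) {
  (void)value;
  g_destructor_calls.fetch_add(1);
}

static void* Worker(void*) {
  static int marker = 0;
  // Storing a non-null value arms the destructor for this thread.
  int rc = pthread_setspecific(g_tls_key, &marker);
  assert(rc == 0);
  (void)rc;
  return nullptr;
}

// Starts one worker thread and returns how often the destructor ran.
int RunWorkerAndCountDestructorCalls() {
  int rc = pthread_key_create(&g_tls_key, OnThreadExit);
  assert(rc == 0);
  pthread_t thread;
  rc = pthread_create(&thread, nullptr, Worker, nullptr);
  assert(rc == 0);
  rc = pthread_join(thread, nullptr);  // the destructor has run by now
  assert(rc == 0);
  rc = pthread_key_delete(g_tls_key);
  assert(rc == 0);
  (void)rc;
  return g_destructor_calls.load();
}
```

A callback of this kind that calls back into the JVM has to cope with the JVM already being gone when the thread exits, which would be consistent with the frames shown above.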

ORC JNI now works fine as well.

Many features are still missing in ORC-JNI, such as reading a full split or
index-based reading.
Do we have any plans to support those?


On Wed, 8 Sept 2021 at 22:06, Manoj Kumar  wrote:

> Hi Wes,
>
> Thanks,
>
> *[Part 1]*
> *C++ HDFS/ORC [Completed]*
> Steps I followed:
> 1) arrow::fs::HadoopFileSystem --> create a Hadoop FS
> 2) std::shared_ptr --> then create a stream
> 3) Pass that stream to adapters::orc::ORCFileReader
>
> *[Part 2]*
> *C++ HDFS/ORC via Java JNI [Partially Completed]*
> *Followed the same approach in orc.jni_wrapper:*
> 1) arrow::fs::HadoopFileSystem --> create a Hadoop FS
> 2) std::shared_ptr --> then create a stream
> 3) Pass that stream to adapters::orc::ORCFileReader
>
>   std::unique_ptr<ORCFileReader> reader;
>   arrow::Status ret;
>   if (path.find("hdfs://") == 0) {
>     arrow::fs::HdfsOptions options_;
>     options_ = *arrow::fs::HdfsOptions::FromUri(path);
>     auto _fsRes = arrow::fs::HadoopFileSystem::Make(options_);
>     if (!_fsRes.ok()) {
>       std::cerr << "HadoopFileSystem::Make failed, it is possible when we don't have "
>                    "proper driver on this node, err msg is "
>                 << _fsRes.status().ToString();
>     }
>     _fs = *_fsRes;
>     auto _stream = *_fs->OpenInputFile(path);
>     // global holder in arrow::jni::ConcurrentMap, cleared during unload
>     hadoop_fs_holder_.Insert(_fs);
>     ret = ORCFileReader::Open(_stream, arrow::default_memory_pool(), &reader);
>     if (!ret.ok()) {
>       env->ThrowNew(io_exception_class, std::string("Failed open file" + path).c_str());
>     }
>     return orc_reader_holder_.Insert(std::shared_ptr<ORCFileReader>(reader.release()));
>   }
>
>
> JNI also works fine, but at the end of the application I get a
> segmentation fault.
>
> *Do you have any idea what causes this? It looks like an issue with libhdfs
> connection close or cleanup.*
>
> *stack trace:*
>   /tmp/tmp3973555041947319188libarrow_orc_jni.so : ()+0xb8b1a3
>   /lib/x86_64-linux-gnu/libpthread.so.0 : ()+0x153c0
>   /lib/x86_64-linux-gnu/libc.so.6 : gsignal()+0xcb
>   /lib/x86_64-linux-gnu/libc.so.6 : abort()+0x12b
>   /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so : ()+0x90e769
>   /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so : ()+0xad3803
>   /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so : JVM_handle_linux_signal()+0x1a5
>   /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so : ()+0x90b8b8
>   /lib/x86_64-linux-gnu/libpthread.so.0 : ()+0x153c0
>   /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so : ()+0x8cac27
>   /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so : ()+0x8cc50b
>
> /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so

Re: HDFS ORC to Arrow Dataset

2021-09-08 Thread Manoj Kumar
Hi Wes,

Thanks,

*[Part 1]*
*C++ HDFS/ORC [Completed]*
Steps I followed:
1) arrow::fs::HadoopFileSystem --> create a Hadoop FS
2) std::shared_ptr --> then create a stream
3) Pass that stream to adapters::orc::ORCFileReader

*[Part 2]*
*C++ HDFS/ORC via Java JNI [Partially Completed]*
*Followed the same approach in orc.jni_wrapper:*
1) arrow::fs::HadoopFileSystem --> create a Hadoop FS
2) std::shared_ptr --> then create a stream
3) Pass that stream to adapters::orc::ORCFileReader

  std::unique_ptr<ORCFileReader> reader;
  arrow::Status ret;
  if (path.find("hdfs://") == 0) {
    arrow::fs::HdfsOptions options_;
    options_ = *arrow::fs::HdfsOptions::FromUri(path);
    auto _fsRes = arrow::fs::HadoopFileSystem::Make(options_);
    if (!_fsRes.ok()) {
      std::cerr << "HadoopFileSystem::Make failed, it is possible when we don't have "
                   "proper driver on this node, err msg is "
                << _fsRes.status().ToString();
    }
    _fs = *_fsRes;
    auto _stream = *_fs->OpenInputFile(path);
    // global holder in arrow::jni::ConcurrentMap, cleared during unload
    hadoop_fs_holder_.Insert(_fs);
    ret = ORCFileReader::Open(_stream, arrow::default_memory_pool(), &reader);
    if (!ret.ok()) {
      env->ThrowNew(io_exception_class, std::string("Failed open file" + path).c_str());
    }
    return orc_reader_holder_.Insert(std::shared_ptr<ORCFileReader>(reader.release()));
  }
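
A side note on the scheme check `path.find("hdfs://") == 0` used in the snippet: when the prefix is absent, `find` keeps searching the rest of the string before returning. A prefix-only comparison is one possible alternative. The sketch below uses only the standard library; `StartsWith`, `IsHdfsUri`, and `Demo` are hypothetical helper names, not part of Arrow.

```cpp
#include <cassert>
#include <string>

// Returns true when `path` begins with `prefix`.
// compare(0, n, prefix) looks only at the first n characters of `path`,
// whereas path.find(prefix) == 0 scans the whole string on a miss.
bool StartsWith(const std::string& path, const std::string& prefix) {
  return path.compare(0, prefix.size(), prefix) == 0;
}

// Hypothetical helper: decides whether the HDFS branch applies.
bool IsHdfsUri(const std::string& path) {
  return StartsWith(path, "hdfs://");
}

// Usage sketch: call Demo() from main() to exercise the helper.
void Demo() {
  assert(IsHdfsUri("hdfs://namenode:8020/data/file.orc"));
  assert(!IsHdfsUri("/local/data/file.orc"));
}
```

Both forms give the same answer for the paths in this thread; the difference only matters for long non-HDFS paths.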


JNI also works fine, but at the end of the application I get a
segmentation fault.

*Do you have any idea what causes this? It looks like an issue with libhdfs
connection close or cleanup.*

*stack trace:*
  /tmp/tmp3973555041947319188libarrow_orc_jni.so : ()+0xb8b1a3
  /lib/x86_64-linux-gnu/libpthread.so.0 : ()+0x153c0
  /lib/x86_64-linux-gnu/libc.so.6 : gsignal()+0xcb
  /lib/x86_64-linux-gnu/libc.so.6 : abort()+0x12b
  /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so : ()+0x90e769
  /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so : ()+0xad3803
  /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so : JVM_handle_linux_signal()+0x1a5
  /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so : ()+0x90b8b8
  /lib/x86_64-linux-gnu/libpthread.so.0 : ()+0x153c0
  /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so : ()+0x8cac27
  /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so : ()+0x8cc50b
  /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so : ()+0xada661
  /home/legion/ha_devel/hadoop-ecosystem-3x/jdk1.8.0_201/jre/lib/amd64/server/libjvm.so : ()+0x6c237d
  */home/legion/ha_devel/hadoop-ecosystem-3x/hadoop-3.1.1/lib/native/libhdfs.so : ()+0xaa4f*
  /lib/x86_64-linux-gnu/libpthread.so.0 : ()+0x85a1
  /lib/x86_64-linux-gnu/libpthread.so.0 : ()+0x962a
  /lib/x86_64-linux-gnu/libc.so.6 : clone()+0x43



On Wed, 8 Sept 2021 at 04:07, Weston Pace  wrote:

> I'll just add that a PR is in progress (thanks Joris!) for adding this
> adapter: https://github.com/apache/arrow/pull/10991
>
> On Tue, Sep 7, 2021 at 12:05 PM Wes McKinney  wrote:
> >
> > I'm missing context but if you're talking about C++/Python, we are
> > currently missing a wrapper interface to the ORC reader in the Arrow
> > datasets library
> >
> > https://github.com/apache/arrow/tree/master/cpp/src/arrow/dataset
> >
> > We have CSV, Arrow (IPC), and Parquet interfaces.
> >
> > But we have an HDFS filesystem implementation and an ORC reader
> > implementation, so mechanically all of the pieces are there but need
> > to be connected together.
> >
> > Thanks,
> > Wes
> >
> > On Tue, Sep 7, 2021 at 8:22 AM Manoj Kumar  wrote:
> > >
> > > Hi Dev-Community,
> > >
> > > > Can anyone guide me on how to read ORC directly from HDFS into an
> > > > Arrow dataset?
> > >
> > > Thanks
> > > Manoj
>


Fwd: HDFS ORC to Arrow Dataset

2021-09-07 Thread Manoj Kumar
Hi Dev-Community,

Can anyone guide me on how to read ORC directly from HDFS into an
Arrow dataset?

Thanks
Manoj


Re: Installing Arrow

2016-07-13 Thread Manoj Kumar
I strung together a set of installation instructions to help future
Python developers.

The PR is here: https://github.com/apache/arrow/pull/105

On Wed, Jul 6, 2016 at 2:28 PM, Wes McKinney  wrote:

> You can look at the Travis CI scripts to see the build procedure for
> each component:
>
> https://github.com/apache/arrow/tree/master/ci
>
> While developing, I usually install things like parquet-cpp in
> $HOME/local and set $PARQUET_HOME (or any other dependency env
> variables) to that directory.
>
> We haven't put much effort into dealing with the optional dependencies
> of pyarrow, or even deciding what is optional vs required. So for now
> installing parquet-cpp is needed for building the Python extensions.
>
> - Wes
>
> On Wed, Jul 6, 2016 at 11:05 AM, Manoj Kumar
>  wrote:
> > Hi,
> >
> > I am having problems installing the Python port of Arrow. It seems that
> > it requires parquet-cpp.
> >
> > So I installed parquet-cpp from source, following the instructions
> > here (https://github.com/apache/parquet-cpp).
> >
> > Running "cmake ." and "make" from the parquet-cpp root directory seems to
> > work, which suggests that parquet-cpp was installed successfully.
> >
> > However, running "sudo python3 setup.py install" from the python root
> > directory fails with the error:
> >
> > "Could not find the Parquet library.  Looked in system search paths."
> >
> > What do I need to set "PARQUET_HOME" to in order to install the Python port?
> >
> > Thanks!
> >
> >
> >
> >
> >
> > On Tue, Jul 5, 2016 at 3:47 PM, Manoj Kumar <
> manojkumarsivaraj...@gmail.com>
> > wrote:
> >
> >> Hi Wes,
> >>
> >> I updated Boost and it works now.
> >>
> >> I'll send a pull request to make a note for that soon.
> >>
> >> On Tue, Jul 5, 2016 at 3:30 PM, Wes McKinney 
> wrote:
> >>
> >>> Oops, had a keyboarding failure. I got cmake 2.8.12.2 via yum on
> >>> CentOS 6 after installing the devtoolset.
> >>>
> >>> On Tue, Jul 5, 2016 at 3:29 PM, Wes McKinney 
> wrote:
> >>> > hi Manoj,
> >>> >
> >>> > What is the output of
> >>> >
> >>> > cmake --version
> >>> >
> >>> > I installed the RHEL devtoolset using the set of commands in
> >>> >
> >>> >
> >>>
> https://github.com/conda-forge/docker-images/blob/master/linux-anvil/Dockerfile
> >>> >
> >>> > I had to run
> >>> >
> >>> > scl enable devtoolset-2 bash
> >>> > x
> >>> > and am able to build the thirdparty on CentOS 6.8.
> >>> >
> >>> >
> >>>
> https://github.com/conda-forge/docker-images/blob/master/linux-anvil/Dockerfile
> >>> >
> >>> > I hit another bug due to old boost-devel in CentOS, so I'd need to
> >>> > look more into that
> >>> >
> >>> > - Wes
> >>> >
> >>> > On Tue, Jul 5, 2016 at 10:40 AM, Manoj Kumar
> >>> >  wrote:
> >>> >> Hi all,
> >>> >>
> >>> >> Thanks for the tips. I upgraded gcc to support C++11.
> >>> >>
> >>> >> g++ --version
> >>> >> g++ (GCC) 4.8.2 20140120 (Red Hat 4.8.2-15)
> >>> >> Copyright (C) 2013 Free Software Foundation, Inc.
> >>> >>
> >>> >> But I still get the error. Here is the full traceback:
> >>> >>
> >>> >> https://gist.github.com/MechCoder/2f97e53c35d36bb132d118f9abc7a255
> >>> >>
> >>> >> Any help would be appreciated.
> >>> >>
> >>> >>
> >>> >>
> >>> >> On Fri, Jul 1, 2016 at 4:59 PM, Micah Kornfield <emkornfi...@gmail.com>
> >>> >> wrote:
> >>> >>
> >>> >>> I think if c++11 was used to compile the library it might also fix
> >>> >>> the issue.
> >>> >>>
> >>> >>> On Friday, July 1, 2016, Holden Karau wrote:
> >>> >>>
> >>> >>> > So doing a bit of hunting it seems like this might be coming
> >>> >>> > since you're missing some expected build libraries, namely you
> >>> >>> > don't have any

[jira] [Created] (ARROW-240) Installation instructions for pyarrow

2016-07-13 Thread Manoj Kumar (JIRA)
Manoj Kumar created ARROW-240:
-

 Summary: Installation instructions for pyarrow
 Key: ARROW-240
 URL: https://issues.apache.org/jira/browse/ARROW-240
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Python
Reporter: Manoj Kumar


It would be great to have updated installation instructions for building the pyarrow project.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Installing Arrow

2016-07-06 Thread Manoj Kumar
Hi,

I am having problems installing the Python port of Arrow. It seems that it
requires parquet-cpp.

So I installed parquet-cpp from source, following the instructions
here (https://github.com/apache/parquet-cpp).

Running "cmake ." and "make" from the parquet-cpp root directory seems to
work, which suggests that parquet-cpp was installed successfully.

However, running "sudo python3 setup.py install" from the python root
directory fails with the error:

"Could not find the Parquet library.  Looked in system search paths."

What do I need to set "PARQUET_HOME" to in order to install the Python port?

Thanks!





On Tue, Jul 5, 2016 at 3:47 PM, Manoj Kumar 
wrote:

> Hi Wes,
>
> I updated Boost and it works now.
>
> I'll send a pull request to make a note for that soon.
>
> On Tue, Jul 5, 2016 at 3:30 PM, Wes McKinney  wrote:
>
>> Oops, had a keyboarding failure. I got cmake 2.8.12.2 via yum on
>> CentOS 6 after installing the devtoolset.
>>
>> On Tue, Jul 5, 2016 at 3:29 PM, Wes McKinney  wrote:
>> > hi Manoj,
>> >
>> > What is the output of
>> >
>> > cmake --version
>> >
>> > I installed the RHEL devtoolset using the set of commands in
>> >
>> >
>> https://github.com/conda-forge/docker-images/blob/master/linux-anvil/Dockerfile
>> >
>> > I had to run
>> >
>> > scl enable devtoolset-2 bash
>> > x
>> > and am able to build the thirdparty on CentOS 6.8.
>> >
>> >
>> https://github.com/conda-forge/docker-images/blob/master/linux-anvil/Dockerfile
>> >
>> > I hit another bug due to old boost-devel in CentOS, so I'd need to
>> > look more into that
>> >
>> > - Wes
>> >
>> > On Tue, Jul 5, 2016 at 10:40 AM, Manoj Kumar
>> >  wrote:
>> >> Hi all,
>> >>
>> >> Thanks for the tips. I upgraded gcc to support C++11.
>> >>
>> >> g++ --version
>> >> g++ (GCC) 4.8.2 20140120 (Red Hat 4.8.2-15)
>> >> Copyright (C) 2013 Free Software Foundation, Inc.
>> >>
>> >> But I still get the error. Here is the full traceback:
>> >>
>> >> https://gist.github.com/MechCoder/2f97e53c35d36bb132d118f9abc7a255
>> >>
>> >> Any help would be appreciated.
>> >>
>> >>
>> >>
>> >> On Fri, Jul 1, 2016 at 4:59 PM, Micah Kornfield wrote:
>> >>
>> >>> I think if c++11 was used to compile the library it might also fix the
>> >>> issue.
>> >>>
>> >>> On Friday, July 1, 2016, Holden Karau  wrote:
>> >>>
>> >>> > So doing a bit of hunting it seems like this might be coming since
>> >>> > you're missing some expected build libraries, namely you don't have
>> >>> > any of the expected regex libraries. I'm used to working on
>> >>> > ubuntu/debian derived systems but it seems like installing
>> >>> > something like `yum groupinstall "Development Tools"` might help
>> >>> > install this and other related libraries you will probably find
>> >>> > yourself needing.
>> >>> > It might be useful to make a note of this in the README as well on
>> >>> > what the expected/required native libraries are.
>> >>> >
>> >>> > On Fri, Jul 1, 2016 at 4:31 PM, Manoj Kumar <
>> >>> > manojkumarsivaraj...@gmail.com >
>> >>> > wrote:
>> >>> >
>> >>> > > Hi,
>> >>> > >
>> >>> > > I am trying to install Arrow using the following instructions.
>> >>> > >
>> >>> > > ./cpp/thirdparty/download_thirdparty.sh
>> >>> > > ./cpp/thirdparty/build_thirdparty.sh
>> >>> > >
>> >>> > >
>> >>> > > It fails with this error:
>> >>> > >
>> >>> > > + cmake -DCMAKE_BUILD_TYPE=Release
>> >>> > > -DCMAKE_INSTALL_PREFIX=/home/manoj/arrow/cpp/thirdparty/installed
>> >>> > > '-DCMAKE_CXX_FLAGS=-fPIC --std=c++0x' .
>> >>> > > -- git Version: v0.0.0-dirty
>> >>> > > -- Version: 0.0.0
>> >>> > > -- Performing Test HAVE_STD_REGEX
>> >>> > > -- Performing Test HAVE_STD_REGEX -- failed to compile
>> >>> > > -- Performing Test HAVE_GNU_POSIX_REGEX
>> >>> > > -- Performing Test HAVE_GNU_POSIX_REGEX -- failed to compile
>> >>> > > -- Performing Test HAVE_POSIX_REGEX
>> >>> > > -- Performing Test HAVE_POSIX_REGEX -- failed to compile
>> >>> > > -- Performing Test HAVE_STEADY_CLOCK
>> >>> > > -- Performing Test HAVE_STEADY_CLOCK -- failed to compile
>> >>> > > CMake Error at src/CMakeLists.txt:17 (message):
>> >>> > >   Failed to determine the source files for the regular expression
>> >>> > backend.
>> >>> > >
>> >>> > >
>> >>> > > I use CentOS version: 6.7
>> >>> > >
>> >>> > > Any help to fix this, would be appreciated. Thanks!
>> >>> > >
>> >>> > > --
>> >>> > > Manoj,
>> >>> > > http://github.com/MechCoder
>> >>> > >
>> >>> >
>> >>> >
>> >>> >
>> >>> > --
>> >>> > Cell : 425-233-8271
>> >>> > Twitter: https://twitter.com/holdenkarau
>> >>> >
>> >>>
>> >>
>> >>
>> >>
>> >> --
>> >> Manoj,
>> >> http://github.com/MechCoder
>>
>
>
>
> --
> Manoj,
> http://github.com/MechCoder
>



-- 
Manoj,
http://github.com/MechCoder


Re: Installing Arrow

2016-07-05 Thread Manoj Kumar
Hi Wes,

I updated Boost and it works now.

I'll send a pull request soon to add a note about that.

On Tue, Jul 5, 2016 at 3:30 PM, Wes McKinney  wrote:

> Oops, had a keyboarding failure. I got cmake 2.8.12.2 via yum on
> CentOS 6 after installing the devtoolset.
>
> On Tue, Jul 5, 2016 at 3:29 PM, Wes McKinney  wrote:
> > hi Manoj,
> >
> > What is the output of
> >
> > cmake --version
> >
> > I installed the RHEL devtoolset using the set of commands in
> >
> >
> https://github.com/conda-forge/docker-images/blob/master/linux-anvil/Dockerfile
> >
> > I had to run
> >
> > scl enable devtoolset-2 bash
> > x
> > and am able to build the thirdparty on CentOS 6.8.
> >
> >
> https://github.com/conda-forge/docker-images/blob/master/linux-anvil/Dockerfile
> >
> > I hit another bug due to old boost-devel in CentOS, so I'd need to
> > look more into that
> >
> > - Wes
> >
> > On Tue, Jul 5, 2016 at 10:40 AM, Manoj Kumar
> >  wrote:
> >> Hi all,
> >>
> >> Thanks for the tips. I upgraded gcc to support C++11.
> >>
> >> g++ --version
> >> g++ (GCC) 4.8.2 20140120 (Red Hat 4.8.2-15)
> >> Copyright (C) 2013 Free Software Foundation, Inc.
> >>
> >> But I still get the error. Here is the full traceback:
> >>
> >> https://gist.github.com/MechCoder/2f97e53c35d36bb132d118f9abc7a255
> >>
> >> Any help would be appreciated.
> >>
> >>
> >>
> >> On Fri, Jul 1, 2016 at 4:59 PM, Micah Kornfield 
> >> wrote:
> >>
> >>> I think if c++11 was used to compile the library it might also fix the
> >>> issue.
> >>>
> >>> On Friday, July 1, 2016, Holden Karau  wrote:
> >>>
> >>> > So doing a bit of hunting it seems like this might be coming since
> >>> > you're missing some expected build libraries, namely you don't have
> >>> > any of the expected regex libraries. I'm used to working on
> >>> > ubuntu/debian derived systems but it seems like installing
> >>> > something like `yum groupinstall "Development Tools"` might help
> >>> > install this and other related libraries you will probably find
> >>> > yourself needing.
> >>> > It might be useful to make a note of this in the README as well on
> >>> > what the expected/required native libraries are.
> >>> >
> >>> > On Fri, Jul 1, 2016 at 4:31 PM, Manoj Kumar <
> >>> > manojkumarsivaraj...@gmail.com >
> >>> > wrote:
> >>> >
> >>> > > Hi,
> >>> > >
> >>> > > I am trying to install Arrow using the following instructions.
> >>> > >
> >>> > > ./cpp/thirdparty/download_thirdparty.sh
> >>> > > ./cpp/thirdparty/build_thirdparty.sh
> >>> > >
> >>> > >
> >>> > > It fails with this error:
> >>> > >
> >>> > > + cmake -DCMAKE_BUILD_TYPE=Release
> >>> > > -DCMAKE_INSTALL_PREFIX=/home/manoj/arrow/cpp/thirdparty/installed
> >>> > > '-DCMAKE_CXX_FLAGS=-fPIC --std=c++0x' .
> >>> > > -- git Version: v0.0.0-dirty
> >>> > > -- Version: 0.0.0
> >>> > > -- Performing Test HAVE_STD_REGEX
> >>> > > -- Performing Test HAVE_STD_REGEX -- failed to compile
> >>> > > -- Performing Test HAVE_GNU_POSIX_REGEX
> >>> > > -- Performing Test HAVE_GNU_POSIX_REGEX -- failed to compile
> >>> > > -- Performing Test HAVE_POSIX_REGEX
> >>> > > -- Performing Test HAVE_POSIX_REGEX -- failed to compile
> >>> > > -- Performing Test HAVE_STEADY_CLOCK
> >>> > > -- Performing Test HAVE_STEADY_CLOCK -- failed to compile
> >>> > > CMake Error at src/CMakeLists.txt:17 (message):
> >>> > >   Failed to determine the source files for the regular expression
> >>> > backend.
> >>> > >
> >>> > >
> >>> > > I use CentOS version: 6.7
> >>> > >
> >>> > > Any help to fix this, would be appreciated. Thanks!
> >>> > >
> >>> > > --
> >>> > > Manoj,
> >>> > > http://github.com/MechCoder
> >>> > >
> >>> >
> >>> >
> >>> >
> >>> > --
> >>> > Cell : 425-233-8271
> >>> > Twitter: https://twitter.com/holdenkarau
> >>> >
> >>>
> >>
> >>
> >>
> >> --
> >> Manoj,
> >> http://github.com/MechCoder
>



-- 
Manoj,
http://github.com/MechCoder


Re: Installing Arrow

2016-07-05 Thread Manoj Kumar
Hi all,

Thanks for the tips. I upgraded gcc to support C++11.

g++ --version
g++ (GCC) 4.8.2 20140120 (Red Hat 4.8.2-15)
Copyright (C) 2013 Free Software Foundation, Inc.

But I still get the error. Here is the full traceback:

https://gist.github.com/MechCoder/2f97e53c35d36bb132d118f9abc7a255

Any help would be appreciated.



On Fri, Jul 1, 2016 at 4:59 PM, Micah Kornfield 
wrote:

> I think if c++11 was used to compile the library it might also fix the
> issue.
>
> On Friday, July 1, 2016, Holden Karau  wrote:
>
> > So doing a bit of hunting it seems like this might be coming since you're
> > missing some expected build libraries, namely you don't have any of the
> > expected regex libraries. I'm used to working on ubuntu/debian derived
> > systems but it seems like installing something like `yum groupinstall
> > "Development Tools"` might help install this and other related libraries
> > you will probably find yourself needing.
> > It might be useful to make a note of this in the README as well on what
> > the expected/required native libraries are.
> >
> > On Fri, Jul 1, 2016 at 4:31 PM, Manoj Kumar <
> > manojkumarsivaraj...@gmail.com >
> > wrote:
> >
> > > Hi,
> > >
> > > I am trying to install Arrow using the following instructions.
> > >
> > > ./cpp/thirdparty/download_thirdparty.sh
> > > ./cpp/thirdparty/build_thirdparty.sh
> > >
> > >
> > > It fails with this error:
> > >
> > > + cmake -DCMAKE_BUILD_TYPE=Release
> > > -DCMAKE_INSTALL_PREFIX=/home/manoj/arrow/cpp/thirdparty/installed
> > > '-DCMAKE_CXX_FLAGS=-fPIC --std=c++0x' .
> > > -- git Version: v0.0.0-dirty
> > > -- Version: 0.0.0
> > > -- Performing Test HAVE_STD_REGEX
> > > -- Performing Test HAVE_STD_REGEX -- failed to compile
> > > -- Performing Test HAVE_GNU_POSIX_REGEX
> > > -- Performing Test HAVE_GNU_POSIX_REGEX -- failed to compile
> > > -- Performing Test HAVE_POSIX_REGEX
> > > -- Performing Test HAVE_POSIX_REGEX -- failed to compile
> > > -- Performing Test HAVE_STEADY_CLOCK
> > > -- Performing Test HAVE_STEADY_CLOCK -- failed to compile
> > > CMake Error at src/CMakeLists.txt:17 (message):
> > >   Failed to determine the source files for the regular expression
> > backend.
> > >
> > >
> > > I use CentOS version: 6.7
> > >
> > > Any help to fix this, would be appreciated. Thanks!
> > >
> > > --
> > > Manoj,
> > > http://github.com/MechCoder
> > >
> >
> >
> >
> > --
> > Cell : 425-233-8271
> > Twitter: https://twitter.com/holdenkarau
> >
>



-- 
Manoj,
http://github.com/MechCoder


Installing Arrow

2016-07-01 Thread Manoj Kumar
Hi,

I am trying to install Arrow using the following instructions.

./cpp/thirdparty/download_thirdparty.sh
./cpp/thirdparty/build_thirdparty.sh


It fails with this error:

+ cmake -DCMAKE_BUILD_TYPE=Release
-DCMAKE_INSTALL_PREFIX=/home/manoj/arrow/cpp/thirdparty/installed
'-DCMAKE_CXX_FLAGS=-fPIC --std=c++0x' .
-- git Version: v0.0.0-dirty
-- Version: 0.0.0
-- Performing Test HAVE_STD_REGEX
-- Performing Test HAVE_STD_REGEX -- failed to compile
-- Performing Test HAVE_GNU_POSIX_REGEX
-- Performing Test HAVE_GNU_POSIX_REGEX -- failed to compile
-- Performing Test HAVE_POSIX_REGEX
-- Performing Test HAVE_POSIX_REGEX -- failed to compile
-- Performing Test HAVE_STEADY_CLOCK
-- Performing Test HAVE_STEADY_CLOCK -- failed to compile
CMake Error at src/CMakeLists.txt:17 (message):
  Failed to determine the source files for the regular expression backend.


I am using CentOS 6.7.

Any help fixing this would be appreciated. Thanks!
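
For readers unfamiliar with the HAVE_STD_REGEX lines in the log above: checks of this kind usually try to build (and sometimes run) a tiny program against the toolchain. The sketch below is only a guess at what such a probe might look like. It is not the actual test source from the third-party library, and the function name and pattern are made up for illustration.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical probe: returns true when std::regex compiles, links,
// and actually matches a simple pattern at run time.
bool StdRegexWorks() {
  const std::string input = "test0159";
  try {
    const std::regex re("^[a-z]+[0-9]+$",
                        std::regex_constants::extended |
                            std::regex_constants::nosubs);
    return std::regex_search(input, re);
  } catch (const std::regex_error&) {
    // An incomplete <regex> implementation may throw instead of matching.
    return false;
  }
}

// Usage sketch: call CheckToolchain() from main() to run the probe.
void CheckToolchain() {
  assert(StdRegexWorks());
}
```

A compiler or standard library without usable C++11 `<regex>` support would fail a probe like this, either at compile time or at run time.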

-- 
Manoj,
http://github.com/MechCoder