[GitHub] kou opened a new pull request #27: [deb] Stop to use -Wdate-time

2018-06-19 Thread GitBox
kou opened a new pull request #27: [deb] Stop to use -Wdate-time
URL: https://github.com/apache/arrow-dist/pull/27
 
 
   FlatBuffers uses `__DATE__` and `__TIME__` and is also built with the 
`-Werror` flag, so the `-Wdate-time` warning becomes a build error.
   
   See also the description of the "timeless" value of "reproducible" in 
http://man7.org/linux/man-pages/man1/dpkg-buildflags.1.html .


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (ARROW-2723) [C++] arrow-orc.pc is missing

2018-06-19 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-2723:
---

 Summary: [C++] arrow-orc.pc is missing
 Key: ARROW-2723
 URL: https://issues.apache.org/jira/browse/ARROW-2723
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Affects Versions: 0.9.0
Reporter: Kouhei Sutou






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2713) [Packaging] Fix linux package builds

2018-06-19 Thread Kouhei Sutou (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-2713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16517646#comment-16517646
 ] 

Kouhei Sutou commented on ARROW-2713:
-

The pull request has been merged.
Can you rebase on master?

> [Packaging] Fix linux package builds
> 
>
> Key: ARROW-2713
> URL: https://issues.apache.org/jira/browse/ARROW-2713
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Krisztian Szucs
>Priority: Major
> Fix For: 0.10.0
>
>
> Build configuration: 
> https://github.com/kszucs/arrow/tree/0d9d89b7bff32823ab68e6ec1dc7ade52511f7ee/dev/tasks/linux-packages
> Failing build: 
> https://travis-ci.org/kszucs/crossbow/builds/391894564?utm_source=github_status&utm_medium=notification
> It looks like the build is waiting for user input. There might also be a 
> hardcoded version somewhere, because the expected version is 0.9.1 instead of 0.9.0.
> ping [~kou] 





[jira] [Created] (ARROW-2722) ndarray to arrow conversion fails when downcasted from pandas to_numeric

2018-06-19 Thread Augusto Radtke (JIRA)
Augusto Radtke created ARROW-2722:
-

 Summary: ndarray to arrow conversion fails when downcasted from 
pandas to_numeric
 Key: ARROW-2722
 URL: https://issues.apache.org/jira/browse/ARROW-2722
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++, Python
Affects Versions: 0.9.0
 Environment: Windows 10 64-bit
Reporter: Augusto Radtke


The following snippet:
{code:python}
import numpy as np
import pandas as pd
import pyarrow as pa

pa.array(pd.to_numeric(pd.Series(np.array([65536,2,3], dtype=np.uint64)), 
downcast='unsigned'), 
from_pandas=True, type='uint32')
{code}
fails to convert with message:
{noformat}
ArrowNotImplementedError Traceback (most recent call last)
 in ()
4 
5 pa.array(pd.to_numeric(pd.Series(np.array([65536,2,3], dtype=np.uint64)), 
downcast='unsigned'), 
> 6 from_pandas=True, type='uint32')

array.pxi in pyarrow.lib.array()

array.pxi in pyarrow.lib._ndarray_to_array()

error.pxi in pyarrow.lib.check_status()

ArrowNotImplementedError: Unsupported numpy type 6{noformat}
 

This is a Windows 64-bit machine, running Python 3.6.5, pyarrow 0.9.0, pandas 
0.23.1 and numpy 1.14.5.

It seems to be fine for uint16 or uint8 downcasting. Unfortunately I didn't have 
the time to dig deeper or try on a Linux machine, but it feels like it's related 
to the LLP64 model.
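
For context on the LLP64 remark: on 64-bit Windows the C `long` type stays 32 bits wide, whereas on 64-bit Linux it is 64 bits wide, so C `unsigned int` and `unsigned long` can be two distinct types of the same size. The following is a minimal sketch, using only the standard library, that checks which data model the running interpreter was built for. The helper name `data_model` is hypothetical, and whether this difference is the actual cause of the error above is an assumption, not something confirmed in this report.

```python
import ctypes


def data_model(pointer_size, long_size):
    """Classify a C data model from the byte sizes of `void *` and `long`.

    Hypothetical helper for illustration only.
    """
    if pointer_size == 8 and long_size == 4:
        return "LLP64"  # 64-bit Windows: long stays 32-bit
    if pointer_size == 8 and long_size == 8:
        return "LP64"   # 64-bit Linux and macOS: long is 64-bit
    return "other"      # for example, 32-bit platforms


if __name__ == "__main__":
    widths = {
        "void *": ctypes.sizeof(ctypes.c_void_p),
        "unsigned int": ctypes.sizeof(ctypes.c_uint),
        "unsigned long": ctypes.sizeof(ctypes.c_ulong),
        "unsigned long long": ctypes.sizeof(ctypes.c_ulonglong),
    }
    for name, size in widths.items():
        print(f"{name}: {size} bytes")
    print(data_model(widths["void *"], widths["unsigned long"]))
```

Running this on both a Windows and a Linux machine would show whether `unsigned long` has the same width as `unsigned int` on the affected platform.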





[jira] [Commented] (ARROW-2372) ArrowIOError: Invalid argument

2018-06-19 Thread Wes McKinney (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516896#comment-16516896
 ] 

Wes McKinney commented on ARROW-2372:
-

The Arrow Python library cannot be installed that way. Refer to the Python 
documentation for instructions on building from source.

> ArrowIOError: Invalid argument
> --
>
> Key: ARROW-2372
> URL: https://issues.apache.org/jira/browse/ARROW-2372
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 0.8.0, 0.9.0
> Environment: Ubuntu 16.04
>Reporter: Kyle Barron
>Priority: Major
> Fix For: 0.10.0
>
>
> I get an ArrowIOError when reading a specific file that was also written by 
> pyarrow. Specifically, the traceback is:
> {code:python}
> >>> import pyarrow.parquet as pq
> >>> pq.ParquetFile('gaz2016zcta5distancemiles.parquet')
>  ---
>  ArrowIOError Traceback (most recent call last)
>   in ()
>  > 1 pf = pq.ParquetFile('gaz2016zcta5distancemiles.parquet')
> ~/local/anaconda3/lib/python3.6/site-packages/pyarrow/parquet.py in 
> __init__(self, source, metadata, common_metadata)
>  62 self.reader = ParquetReader()
>  63 source = _ensure_file(source)
>  ---> 64 self.reader.open(source, metadata=metadata)
>  65 self.common_metadata = common_metadata
>  66 self._nested_paths_by_prefix = self._build_nested_paths()
> _parquet.pyx in pyarrow._parquet.ParquetReader.open()
> error.pxi in pyarrow.lib.check_status()
> ArrowIOError: Arrow error: IOError: [Errno 22] Invalid argument
> {code}
> Here's a reproducible example with the specific file I'm working with. I'm 
> converting a 34 GB csv file to parquet in chunks of roughly 2GB each. To get 
> the source data:
> {code:bash}
> wget 
> https://www.nber.org/distance/2016/gaz/zcta5/gaz2016zcta5distancemiles.csv.zip
> unzip gaz2016zcta5distancemiles.csv.zip{code}
> Then the basic idea from the [pyarrow Parquet 
> documentation|https://arrow.apache.org/docs/python/parquet.html#finer-grained-reading-and-writing]
>  is instantiating the writer class; looping over chunks of the csv and 
> writing them to parquet; then closing the writer object.
>  
> {code:python}
> import numpy as np
> import pandas as pd
> import pyarrow as pa
> import pyarrow.parquet as pq
> from pathlib import Path
>
> zcta_file = Path('gaz2016zcta5distancemiles.csv')
> # Read the CSV lazily, one chunk of rows at a time.
> itr = pd.read_csv(
>     zcta_file,
>     header=0,
>     dtype={'zip1': str, 'zip2': str, 'mi_to_zcta5': np.float64},
>     engine='c',
>     chunksize=64617153)
> schema = pa.schema([
>     pa.field('zip1', pa.string()),
>     pa.field('zip2', pa.string()),
>     pa.field('mi_to_zcta5', pa.float64())])
> writer = pq.ParquetWriter('gaz2016zcta5distancemiles.parquet', schema=schema)
> print('Starting conversion')
> i = 0
> # Append each chunk to the same Parquet file.
> for df in itr:
>     i += 1
>     print(f'Finished reading csv block {i}')
>     table = pa.Table.from_pandas(df, preserve_index=False, nthreads=3)
>     writer.write_table(table)
>     print(f'Finished writing parquet block {i}')
> writer.close()
> {code}
> Running this Python script produces the file 
> `gaz2016zcta5distancemiles.parquet`, but just attempting to read the metadata 
> with `pq.ParquetFile()` produces the above exception.
> I tested this with pyarrow 0.8 and pyarrow 0.9. I assume that pandas would 
> complain on import of the csv if the columns in the data were not `string`, 
> `string`, and `float64`, so I think creating the Parquet schema in that way 
> should be fine.
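
The chunked write pattern described above (open one writer, loop over chunks, close the writer) can be sketched without pandas or pyarrow. This is only an illustration using the standard library: `iter_chunks` and `write_in_chunks` are hypothetical names, a plain CSV writer stands in for `pq.ParquetWriter`, and the sample rows are made up.

```python
import csv
import io
from itertools import islice


def iter_chunks(rows, chunk_size):
    """Yield successive lists of at most `chunk_size` items from `rows`."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


def write_in_chunks(rows, out, chunk_size):
    """Write `rows` to the file-like `out` chunk by chunk; return the chunk count."""
    writer = csv.writer(out)
    count = 0
    for chunk in iter_chunks(rows, chunk_size):
        writer.writerows(chunk)
        count += 1
    return count


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    rows = [("a", "b", 7.5), ("a", "c", 12.0), ("b", "c", 9.25)]
    with io.StringIO() as out:  # stand-in for the real output file
        n = write_in_chunks(rows, out, chunk_size=2)
        print(f"wrote {n} chunks")
```

Wrapping the output in a `with` block, as in the demo, closes it even if writing a chunk raises partway through.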





[jira] [Commented] (ARROW-2372) ArrowIOError: Invalid argument

2018-06-19 Thread Beatriz (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516868#comment-16516868
 ] 

Beatriz  commented on ARROW-2372:
-

I got the same issue. Trying to install Arrow from git master with 

pip install git+https://github.com/apache/arrow.git

returns an error:

FileNotFoundError: [Errno 2] No such file or directory: 
'C:\\Users\\user\\AppData\\Local\\Temp\\pip-req-build-x2lu_5ci\\setup.py'

Am I missing something? Thanks.

> ArrowIOError: Invalid argument
> --
>
> Key: ARROW-2372
> URL: https://issues.apache.org/jira/browse/ARROW-2372
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 0.8.0, 0.9.0
> Environment: Ubuntu 16.04
>Reporter: Kyle Barron
>Priority: Major
> Fix For: 0.10.0
>
>





[jira] [Assigned] (ARROW-2721) [C++] Link error with Arrow C++ build with -DARROW_ORC=ON on CentOS 7

2018-06-19 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-2721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn reassigned ARROW-2721:
--

Assignee: Kouhei Sutou

> [C++] Link error with Arrow C++ build with -DARROW_ORC=ON on CentOS 7
> -
>
> Key: ARROW-2721
> URL: https://issues.apache.org/jira/browse/ARROW-2721
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Build master with -DARROW_ORC=ON:
> {code:shell}
> sudo yum install -y epel-release
> sudo yum groupinstall -y "Development Tools"
> sudo yum install -y \
>   autoconf-archive \
>   boost-devel \
>   cmake3 \
>   git \
>   gobject-introspection-devel \
>   gtk-doc \
>   jemalloc-devel \
>   pkg-config \
>   tar
> git clone https://github.com/apache/arrow.git
> mkdir -p arrow/cpp/build
> cd arrow/cpp/build
> LANG=C cmake3 .. -DCMAKE_BUILD_TYPE=release -DARROW_ORC=ON
> make -j4
> sudo make install
> {code}
> Sample program:
> {code:cpp}
> #include 
> int main(void) {
>   return 0;
> }
> {code}
> Build the sample program:
> {noformat}
> % g++ -std=c++11 -o sample $(PKG_CONFIG_PATH=/usr/local/lib64/pkgconfig 
> pkg-config --cflags --libs arrow) sample.cpp
> /usr/local/lib64/libarrow.so: undefined reference to 
> `google::protobuf::internal::WireFormat::SerializeUnknownFields(google::protobuf::UnknownFieldSet
>  const&, google::protobuf::io::CodedOutputStream*)'
> /usr/local/lib64/libarrow.so: undefined reference to 
> `google::protobuf::Message::SpaceUsed() const'
> /usr/local/lib64/libarrow.so: undefined reference to 
> `google::protobuf::DescriptorPool::FindFileByName(std::string const&) const'
> /usr/local/lib64/libarrow.so: undefined reference to 
> `google::protobuf::internal::WireFormatLite::WriteDouble(int, double, 
> google::protobuf::io::CodedOutputStream*)'
> /usr/local/lib64/libarrow.so: undefined reference to 
> `google::protobuf::io::CodedInputStream::BytesUntilTotalBytesLimit() const'
> /usr/local/lib64/libarrow.so: undefined reference to 
> `google::protobuf::DescriptorPool::InternalAddGeneratedFile(void const*, int)'
> /usr/local/lib64/libarrow.so: undefined reference to 
> `google::protobuf::internal::StringTypeHandlerBase::New()'
> /usr/local/lib64/libarrow.so: undefined reference to 
> `google::protobuf::io::ZeroCopyOutputStream::~ZeroCopyOutputStream()'
> /usr/local/lib64/libarrow.so: undefined reference to 
> `google::protobuf::Message::CheckTypeAndMergeFrom(google::protobuf::MessageLite
>  const&)'
> /usr/local/lib64/libarrow.so: undefined reference to 
> `google::protobuf::MessageLite::ParseFromZeroCopyStream(google::protobuf::io::ZeroCopyInputStream*)'
> /usr/local/lib64/libarrow.so: undefined reference to 
> `google::protobuf::io::CodedInputStream::ReadRaw(void*, int)'
> /usr/local/lib64/libarrow.so: undefined reference to 
> `google::protobuf::MessageFactory::InternalRegisterGeneratedFile(char const*, 
> void (*)(std::string const&))'
> /usr/local/lib64/libarrow.so: undefined reference to 
> `google::protobuf::io::CodedOutputStream::VarintSize32Fallback(unsigned int)'
> /usr/local/lib64/libarrow.so: undefined reference to 
> `google::protobuf::internal::LogMessage::LogMessage(google::protobuf::LogLevel,
>  char const*, int)'
> /usr/local/lib64/libarrow.so: undefined reference to 
> `google::protobuf::internal::empty_string_'
> /usr/local/lib64/libarrow.so: undefined reference to 
> `google::protobuf::internal::StringTypeHandlerBase::Delete(std::string*)'
> /usr/local/lib64/libarrow.so: undefined reference to 
> `google::protobuf::io::CodedOutputStream::WriteVarint64(unsigned long)'
> /usr/local/lib64/libarrow.so: undefined reference to 
> `google::protobuf::DescriptorPool::generated_pool()'
> /usr/local/lib64/libarrow.so: undefined reference to 
> `google::protobuf::internal::WireFormatLite::WriteEnum(int, int, 
> google::protobuf::io::CodedOutputStream*)'
> /usr/local/lib64/libarrow.so: undefined reference to 
> `google::protobuf::internal::WireFormatLite::WriteString(int, std::string 
> const&, google::protobuf::io::CodedOutputStream*)'
> /usr/local/lib64/libarrow.so: undefined reference to 
> `google::protobuf::internal::WireFormat::SerializeUnknownFieldsToArray(google::protobuf::UnknownFieldSet
>  const&, unsigned char*)'
> /usr/local/lib64/libarrow.so: undefined reference to 
> `google::protobuf::io::CodedInputStream::ReadTagFallback()'
> /usr/local/lib64/libarrow.so: undefined reference to 
> `google::protobuf::internal::OnShutdown(void (*)())'
> /usr/local/lib64/libarrow.so: undefined reference to 
> `google::protobuf::internal::RepeatedPtrFieldBase::Swap(google::protobuf::internal::Rep

[jira] [Resolved] (ARROW-2721) [C++] Link error with Arrow C++ build with -DARROW_ORC=ON on CentOS 7

2018-06-19 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-2721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved ARROW-2721.

   Resolution: Fixed
Fix Version/s: 0.10.0

Issue resolved by pull request 2146
[https://github.com/apache/arrow/pull/2146]

> [C++] Link error with Arrow C++ build with -DARROW_ORC=ON on CentOS 7
> -
>
> Key: ARROW-2721
> URL: https://issues.apache.org/jira/browse/ARROW-2721
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Kouhei Sutou
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>

[jira] [Commented] (ARROW-2713) [Packaging] Fix linux package builds

2018-06-19 Thread Kouhei Sutou (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-2713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516761#comment-16516761
 ] 

Kouhei Sutou commented on ARROW-2713:
-

https://github.com/apache/arrow/pull/2146 fixes the error on CentOS 7.


> [Packaging] Fix linux package builds
> 
>
> Key: ARROW-2713
> URL: https://issues.apache.org/jira/browse/ARROW-2713
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Krisztian Szucs
>Priority: Major
> Fix For: 0.10.0
>
>





[jira] [Updated] (ARROW-2721) [C++] Link error with Arrow C++ build with -DARROW_ORC=ON on CentOS 7

2018-06-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-2721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-2721:
--
Labels: pull-request-available  (was: )

> [C++] Link error with Arrow C++ build with -DARROW_ORC=ON on CentOS 7
> -
>
> Key: ARROW-2721
> URL: https://issues.apache.org/jira/browse/ARROW-2721
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Kouhei Sutou
>Priority: Minor
>  Labels: pull-request-available
>

[GitHub] xhochy closed pull request #26: [RPM] Remove duplicated -DARROW_ORC=ON

2018-06-19 Thread GitBox
xhochy closed pull request #26: [RPM] Remove duplicated -DARROW_ORC=ON
URL: https://github.com/apache/arrow-dist/pull/26
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:


diff --git a/cpp-linux/yum/arrow.spec.in b/cpp-linux/yum/arrow.spec.in
index 65c357e..013aaec 100644
--- a/cpp-linux/yum/arrow.spec.in
+++ b/cpp-linux/yum/arrow.spec.in
@@ -60,7 +60,6 @@ mkdir cpp/build
 cd cpp/build
 %cmake3 .. \
   -DCMAKE_BUILD_TYPE=$build_type \
-  -DARROW_ORC=ON \
 %if %{use_python}
   -DARROW_PYTHON=ON \
   -DPythonInterp_FIND_VERSION=ON \


 




[jira] [Commented] (ARROW-2713) [Packaging] Fix linux package builds

2018-06-19 Thread Kouhei Sutou (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-2713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516742#comment-16516742
 ] 

Kouhei Sutou commented on ARROW-2713:
-

For Debian GNU/Linux stretch:

FlatBuffers fails to build:

{noformat}
cat 
/build/apache-arrow-0.10.0.20180617/cpp_build/flatbuffers_ep-prefix/src/flatbuffers_ep-stamp/flatbuffers_ep-build-err.log
 
/build/apache-arrow-0.10.0.20180617/cpp_build/flatbuffers_ep-prefix/src/flatbuffers_ep/src/flatc.cpp:250:38:
 error: macro "__DATE__" might prevent reproducible builds [-Werror=date-time]
 printf("flatc version %s\n", FLATC_VERSION);
  ^
/build/apache-arrow-0.10.0.20180617/cpp_build/flatbuffers_ep-prefix/src/flatbuffers_ep/src/flatc.cpp:21:33:
 error: macro "__TIME__" might prevent reproducible builds [-Werror=date-time]
 #define FLATC_VERSION "1.9.0 (" __DATE__ " " __TIME__ ")"
 ^
/build/apache-arrow-0.10.0.20180617/cpp_build/flatbuffers_ep-prefix/src/flatbuffers_ep/src/flatc.cpp:250:38:
 note: in expansion of macro 'FLATC_VERSION'
 printf("flatc version %s\n", FLATC_VERSION);
  ^
cc1plus: all warnings being treated as errors
{noformat}

Debian GNU/Linux stretch adds the -Wdate-time flag by default, and FlatBuffers is 
built with the -Werror flag, so the date-time warning becomes an error.

See also the description of the "timeless" value of "reproducible" in 
http://man7.org/linux/man-pages/man1/dpkg-buildflags.1.html .

We can disable the feature with:

{code:diff}
diff --git a/dev/tasks/linux-packages/debian/rules 
b/dev/tasks/linux-packages/debian/rules
index 5a3c0a31..5af70521 100755
--- a/dev/tasks/linux-packages/debian/rules
+++ b/dev/tasks/linux-packages/debian/rules
@@ -6,6 +6,8 @@
 # This has to be exported to make some magic below work.
 export DH_OPTIONS
 
+export DEB_BUILD_MAINT_OPTIONS=reproducible=-timeless
+
 BUILD_TYPE=release
 
 %:
{code}
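
To make the `reproducible=-timeless` value easier to read: per the dpkg-buildflags manual page, each feature area takes a comma-separated list of features, each prefixed with `+` (enable) or `-` (disable). The sketch below parses such a string. `parse_maint_options` is a hypothetical helper, it is not part of dpkg, and it ignores special cases such as the `all` feature.

```python
def parse_maint_options(value):
    """Parse a DEB_BUILD_MAINT_OPTIONS-style string into {area: {feature: bool}}.

    Simplified illustration: areas are separated by whitespace, features by
    commas, and each feature carries a '+' or '-' modifier.
    """
    result = {}
    for item in value.split():
        area, sep, features = item.partition("=")
        if not sep:
            continue  # entries without '=' carry no feature list
        flags = result.setdefault(area, {})
        for feature in features.split(","):
            if feature.startswith("+"):
                flags[feature[1:]] = True
            elif feature.startswith("-"):
                flags[feature[1:]] = False
    return result


if __name__ == "__main__":
    # The value exported in the debian/rules change above.
    print(parse_maint_options("reproducible=-timeless"))
```

Read this way, the exported value switches off the `timeless` feature of the `reproducible` area, which is the feature that adds `-Wdate-time` to the preprocessor flags.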

> [Packaging] Fix linux package builds
> 
>
> Key: ARROW-2713
> URL: https://issues.apache.org/jira/browse/ARROW-2713
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Krisztian Szucs
>Priority: Major
> Fix For: 0.10.0
>
>





[jira] [Created] (ARROW-2721) [C++] Link error with Arrow C++ build with -DARROW_ORC=ON on CentOS 7

2018-06-19 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-2721:
---

 Summary: [C++] Link error with Arrow C++ build with -DARROW_ORC=ON 
on CentOS 7
 Key: ARROW-2721
 URL: https://issues.apache.org/jira/browse/ARROW-2721
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Reporter: Kouhei Sutou


Build master with -DARROW_ORC=ON:

{code:shell}
sudo yum install -y epel-release
sudo yum groupinstall -y "Development Tools"
sudo yum install -y \
  autoconf-archive \
  boost-devel \
  cmake3 \
  git \
  gobject-introspection-devel \
  gtk-doc \
  jemalloc-devel \
  pkg-config \
  tar
git clone https://github.com/apache/arrow.git
mkdir -p arrow/cpp/build
cd arrow/cpp/build
LANG=C cmake3 .. -DCMAKE_BUILD_TYPE=release -DARROW_ORC=ON
make -j4
sudo make install
{code}

Sample program:

{code:cpp}
#include 

int main(void) {
  return 0;
}
{code}

Build the sample program:

{noformat}
% g++ -std=c++11 -o sample $(PKG_CONFIG_PATH=/usr/local/lib64/pkgconfig 
pkg-config --cflags --libs arrow) sample.cpp
/usr/local/lib64/libarrow.so: undefined reference to 
`google::protobuf::internal::WireFormat::SerializeUnknownFields(google::protobuf::UnknownFieldSet
 const&, google::protobuf::io::CodedOutputStream*)'
/usr/local/lib64/libarrow.so: undefined reference to 
`google::protobuf::Message::SpaceUsed() const'
/usr/local/lib64/libarrow.so: undefined reference to 
`google::protobuf::DescriptorPool::FindFileByName(std::string const&) const'
/usr/local/lib64/libarrow.so: undefined reference to 
`google::protobuf::internal::WireFormatLite::WriteDouble(int, double, 
google::protobuf::io::CodedOutputStream*)'
/usr/local/lib64/libarrow.so: undefined reference to 
`google::protobuf::io::CodedInputStream::BytesUntilTotalBytesLimit() const'
/usr/local/lib64/libarrow.so: undefined reference to 
`google::protobuf::DescriptorPool::InternalAddGeneratedFile(void const*, int)'
/usr/local/lib64/libarrow.so: undefined reference to 
`google::protobuf::internal::StringTypeHandlerBase::New()'
/usr/local/lib64/libarrow.so: undefined reference to 
`google::protobuf::io::ZeroCopyOutputStream::~ZeroCopyOutputStream()'
/usr/local/lib64/libarrow.so: undefined reference to 
`google::protobuf::Message::CheckTypeAndMergeFrom(google::protobuf::MessageLite 
const&)'
/usr/local/lib64/libarrow.so: undefined reference to 
`google::protobuf::MessageLite::ParseFromZeroCopyStream(google::protobuf::io::ZeroCopyInputStream*)'
/usr/local/lib64/libarrow.so: undefined reference to 
`google::protobuf::io::CodedInputStream::ReadRaw(void*, int)'
/usr/local/lib64/libarrow.so: undefined reference to 
`google::protobuf::MessageFactory::InternalRegisterGeneratedFile(char const*, 
void (*)(std::string const&))'
/usr/local/lib64/libarrow.so: undefined reference to 
`google::protobuf::io::CodedOutputStream::VarintSize32Fallback(unsigned int)'
/usr/local/lib64/libarrow.so: undefined reference to 
`google::protobuf::internal::LogMessage::LogMessage(google::protobuf::LogLevel, 
char const*, int)'
/usr/local/lib64/libarrow.so: undefined reference to 
`google::protobuf::internal::empty_string_'
/usr/local/lib64/libarrow.so: undefined reference to 
`google::protobuf::internal::StringTypeHandlerBase::Delete(std::string*)'
/usr/local/lib64/libarrow.so: undefined reference to 
`google::protobuf::io::CodedOutputStream::WriteVarint64(unsigned long)'
/usr/local/lib64/libarrow.so: undefined reference to 
`google::protobuf::DescriptorPool::generated_pool()'
/usr/local/lib64/libarrow.so: undefined reference to 
`google::protobuf::internal::WireFormatLite::WriteEnum(int, int, 
google::protobuf::io::CodedOutputStream*)'
/usr/local/lib64/libarrow.so: undefined reference to 
`google::protobuf::internal::WireFormatLite::WriteString(int, std::string 
const&, google::protobuf::io::CodedOutputStream*)'
/usr/local/lib64/libarrow.so: undefined reference to 
`google::protobuf::internal::WireFormat::SerializeUnknownFieldsToArray(google::protobuf::UnknownFieldSet
 const&, unsigned char*)'
/usr/local/lib64/libarrow.so: undefined reference to 
`google::protobuf::io::CodedInputStream::ReadTagFallback()'
/usr/local/lib64/libarrow.so: undefined reference to 
`google::protobuf::internal::OnShutdown(void (*)())'
/usr/local/lib64/libarrow.so: undefined reference to 
`google::protobuf::internal::RepeatedPtrFieldBase::Swap(google::protobuf::internal::RepeatedPtrFieldBase*)'
/usr/local/lib64/libarrow.so: undefined reference to 
`google::protobuf::MessageFactory::generated_factory()'
/usr/local/lib64/libarrow.so: undefined reference to 
`google::protobuf::UnknownFieldSet::AddVarint(int, unsigned long)'
/usr/local/lib64/libarrow.so: undefined reference to 
`google::protobuf::io::CodedInputStream::Skip(int)'
/usr/local/lib64/libarrow.so: undefined reference to 
`google::protobuf::internal::WireFormat::ComputeUnknownFieldsSize(google::protobuf::UnknownFieldSet
 const&)'
/usr/local/lib64/libarrow.so: undefined reference to 
`google: