[jira] [Updated] (ARROW-6991) [Packaging][deb] Add support for Ubuntu 19.10
[ https://issues.apache.org/jira/browse/ARROW-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6991: -- Labels: pull-request-available (was: ) > [Packaging][deb] Add support for Ubuntu 19.10 > - > > Key: ARROW-6991 > URL: https://issues.apache.org/jira/browse/ARROW-6991 > Project: Apache Arrow > Issue Type: Improvement > Components: Packaging >Reporter: Kouhei Sutou >Assignee: Kouhei Sutou >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-6991) [Packaging][deb] Add support for Ubuntu 19.10
Kouhei Sutou created ARROW-6991: --- Summary: [Packaging][deb] Add support for Ubuntu 19.10 Key: ARROW-6991 URL: https://issues.apache.org/jira/browse/ARROW-6991 Project: Apache Arrow Issue Type: Improvement Components: Packaging Reporter: Kouhei Sutou Assignee: Kouhei Sutou -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-6598) [Java] Sort the code for ApproxEqualsVisitor
[ https://issues.apache.org/jira/browse/ARROW-6598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield resolved ARROW-6598. Fix Version/s: 1.0.0 Resolution: Fixed Issue resolved by pull request 5418 [https://github.com/apache/arrow/pull/5418] > [Java] Sort the code for ApproxEqualsVisitor > > > Key: ARROW-6598 > URL: https://issues.apache.org/jira/browse/ARROW-6598 > Project: Apache Arrow > Issue Type: Improvement > Components: Java >Reporter: Liya Fan >Assignee: Liya Fan >Priority: Minor > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 2.5h > Remaining Estimate: 0h > > As a follow up issue of ARROW-6458, we finalize the code for > ApproxEqualsVisitor. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6989) [Python][C++] Assert is triggered when decimal type inference occurs on a value with out of range precision
[ https://issues.apache.org/jira/browse/ARROW-6989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated ARROW-6989: --- Component/s: C++ > [Python][C++] Assert is triggered when decimal type inference occurs on a > value with out of range precision > --- > > Key: ARROW-6989 > URL: https://issues.apache.org/jira/browse/ARROW-6989 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Reporter: Micah Kornfield >Priority: Major > > Example: > pa.array([decimal.Decimal(123.234)]) > > The problem is that inference.cc calls the direct constructor for decimal > types instead of using Make. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-6990) [C++] Support casting between decimal types with compatible precision/scales
Micah Kornfield created ARROW-6990: -- Summary: [C++] Support casting between decimal types with compatible precision/scales Key: ARROW-6990 URL: https://issues.apache.org/jira/browse/ARROW-6990 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Micah Kornfield This seems like a reasonable thing to support and showed up as a question on the user mailing list (through some sort of python code). -- This message was sent by Atlassian Jira (v8.3.4#803005)
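The issue above does not define what "compatible precision/scales" means. One plausible reading, sketched here as an assumption rather than as Arrow's actual rule, is that a cast is safe when the target keeps at least as many fractional digits and at least as many integer digits as the source. The names `DecimalType` and `is_lossless_cast` are hypothetical and exist only for this illustration; non-negative scales are assumed.

```python
from typing import NamedTuple


class DecimalType(NamedTuple):
    """Hypothetical description of a decimal type (not an Arrow class)."""
    precision: int  # total number of significant digits
    scale: int      # digits to the right of the decimal point


def is_lossless_cast(src: DecimalType, dst: DecimalType) -> bool:
    """Return True if every value of `src` fits in `dst` without rounding or overflow."""
    src_integer_digits = src.precision - src.scale
    dst_integer_digits = dst.precision - dst.scale
    # The target must not drop fractional digits and must not drop integer digits.
    return dst.scale >= src.scale and dst_integer_digits >= src_integer_digits


widened = is_lossless_cast(DecimalType(10, 2), DecimalType(12, 4))
squeezed = is_lossless_cast(DecimalType(10, 2), DecimalType(10, 4))
truncated = is_lossless_cast(DecimalType(10, 2), DecimalType(12, 1))
```

Under this reading, decimal(10, 2) to decimal(12, 4) would be allowed, while decimal(10, 2) to decimal(10, 4) would not, because the target has only six integer digits for a source that may need eight.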
[jira] [Created] (ARROW-6989) [Python][C++] Assert is triggered when decimal type inference occurs on a value with out of range precision
Micah Kornfield created ARROW-6989: -- Summary: [Python][C++] Assert is triggered when decimal type inference occurs on a value with out of range precision Key: ARROW-6989 URL: https://issues.apache.org/jira/browse/ARROW-6989 Project: Apache Arrow Issue Type: Bug Reporter: Micah Kornfield Example: pa.array([decimal.Decimal(123.234)]) The problem is that inference.cc calls the direct constructor for decimal types instead of using Make. -- This message was sent by Atlassian Jira (v8.3.4#803005)
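A likely reason the example in ARROW-6989 has out-of-range precision can be shown with Python's standard `decimal` module alone (this is an illustration, not Arrow code): building a `Decimal` from a float captures the exact binary value of that float, so it carries far more digits than the six that were typed. The limit of 38 digits is the maximum precision of a 128-bit decimal and is assumed to be the limit the issue refers to; the helper `precision_of` is hypothetical.

```python
import decimal

# Maximum precision of a 128-bit decimal (assumed to be the relevant limit here).
MAX_DECIMAL128_PRECISION = 38


def precision_of(value: decimal.Decimal) -> int:
    """Return the number of significant digits stored in the Decimal."""
    return len(value.as_tuple().digits)


from_float = decimal.Decimal(123.234)     # exact expansion of the binary float
from_string = decimal.Decimal("123.234")  # exactly the digits that were written

print("float-derived precision exceeds limit:",
      precision_of(from_float) > MAX_DECIMAL128_PRECISION)
print("string-derived precision:", precision_of(from_string))
```

Constructing the value from a string keeps the precision at six digits, which is well inside the limit, so the two spellings behave very differently during type inference.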
[jira] [Commented] (ARROW-6988) [CI][R] Buildbot's R Conda is failing
[ https://issues.apache.org/jira/browse/ARROW-6988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16959392#comment-16959392 ] Neal Richardson commented on ARROW-6988: Everywhere else the R builds are passing, including the Conda R job on Krisztián's docker-compose/GitHub-Actions branch, so I'm not yet convinced this is a real problem. > [CI][R] Buildbot's R Conda is failing > - > > Key: ARROW-6988 > URL: https://issues.apache.org/jira/browse/ARROW-6988 > Project: Apache Arrow > Issue Type: Improvement > Components: Continuous Integration, R >Reporter: Francois Saint-Jacques >Priority: Major > > {code:java} > Running ‘testthat.R’ > ERROR > Running the tests in ‘tests/testthat.R’ failed. > Last 13 lines of output: > 25: tryCatch(withCallingHandlers({eval(code, test_env)if (!handled > && !is.null(test)) {skip_empty()}}, expectation = > handle_expectation, skip = handle_skip, warning = handle_warning, message > = handle_message, error = handle_error), error = handle_fatal, skip = > function(e) {}) > 26: test_code(NULL, exprs, env) > 27: source_file(path, new.env(parent = env), chdir = TRUE, wrap = wrap) > 28: force(code) > 29: with_reporter(reporter = reporter, start_end_reporter = > start_end_reporter, {reporter$start_file(basename(path)) > lister$start_file(basename(path))source_file(path, new.env(parent = > env), chdir = TRUE, wrap = wrap)reporter$.end_context() > reporter$end_file()}) > 30: FUN(X[[i]], ...) 
> 31: lapply(paths, test_file, env = env, reporter = current_reporter, > start_end_reporter = FALSE, load_helpers = FALSE, wrap = wrap) > 32: force(code) > 33: with_reporter(reporter = current_reporter, results <- lapply(paths, > test_file, env = env, reporter = current_reporter, start_end_reporter = > FALSE, load_helpers = FALSE, wrap = wrap)) > 34: test_files(paths, reporter = reporter, env = env, stop_on_failure = > stop_on_failure, stop_on_warning = stop_on_warning, wrap = wrap) > 35: test_dir(path = test_path, reporter = reporter, env = env, filter = > filter, ..., stop_on_failure = stop_on_failure, stop_on_warning = > stop_on_warning, wrap = wrap) > 36: test_package_dir(package = package, test_path = test_path, filter = > filter, reporter = reporter, ..., stop_on_failure = stop_on_failure, > stop_on_warning = stop_on_warning, wrap = wrap) > 37: test_check("arrow") > An irrecoverable exception occurred. R is aborting now ... > Segmentation fault (core dumped) > * checking for unstated dependencies in vignettes ... OK > * checking package vignettes in ‘inst/doc’ ... OK > * checking re-building of vignette outputs ... OK > * DONE > Status: 1 ERROR, 1 WARNING, 2 NOTEs > See > ‘/buildbot/AMD64_Conda_R/r/arrow.Rcheck/00check.log’ > for details. > {code} > [|https://ci.ursalabs.org/#/builders/95] > [https://ci.ursalabs.org/#/builders/95/builds/2386] > [https://ci.ursalabs.org/#/builders/95] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-6969) [C++][Dataset] ParquetScanTask eagerly load file
[ https://issues.apache.org/jira/browse/ARROW-6969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques resolved ARROW-6969. --- Fix Version/s: 1.0.0 Resolution: Fixed Issue resolved by pull request 5725 [https://github.com/apache/arrow/pull/5725] > [C++][Dataset] ParquetScanTask eagerly load file > - > > Key: ARROW-6969 > URL: https://issues.apache.org/jira/browse/ARROW-6969 > Project: Apache Arrow > Issue Type: Improvement > Components: C++, Dataset >Reporter: Francois Saint-Jacques >Assignee: Francois Saint-Jacques >Priority: Major > Labels: dataset, pull-request-available > Fix For: 1.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > The file content should only be read when invoking ParquetScanTask::Scan, not > on construction. This blocks reading in a true streaming fashion with memory > constraints. -- This message was sent by Atlassian Jira (v8.3.4#803005)
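The behaviour requested in ARROW-6969, reading file content only when the scan is invoked rather than at construction, can be sketched in Python. This is a minimal stand-in under stated assumptions: `LazyScanTask`, its `scan` method, and the `bytes_read` counter are hypothetical names for illustration and are not the Arrow C++ `ParquetScanTask` API.

```python
import os
import tempfile
from typing import Iterator


class LazyScanTask:
    """Hypothetical scan task that defers all I/O until scan() is iterated."""

    def __init__(self, path: str) -> None:
        self.path = path      # only remember where the data lives
        self.bytes_read = 0   # nothing is read at construction time

    def scan(self, chunk_size: int = 4) -> Iterator[bytes]:
        # The file is opened and read only once iteration starts,
        # and only one chunk is held in memory at a time.
        with open(self.path, "rb") as f:
            while chunk := f.read(chunk_size):
                self.bytes_read += len(chunk)
                yield chunk


# Usage: write a small temporary file, then scan it in chunks.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789")

task = LazyScanTask(tmp.name)
read_before_scan = task.bytes_read   # still zero: construction did no I/O
chunks = list(task.scan())
os.remove(tmp.name)
```

Because construction is free of I/O, many tasks can be created up front and consumed one at a time, which is what makes streaming under a memory limit possible.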
[jira] [Updated] (ARROW-6969) [C++][Dataset] ParquetScanTask eagerly load file
[ https://issues.apache.org/jira/browse/ARROW-6969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-6969: -- Component/s: C++ > [C++][Dataset] ParquetScanTask eagerly load file > - > > Key: ARROW-6969 > URL: https://issues.apache.org/jira/browse/ARROW-6969 > Project: Apache Arrow > Issue Type: Improvement > Components: C++, Dataset >Reporter: Francois Saint-Jacques >Assignee: Francois Saint-Jacques >Priority: Major > Labels: dataset, pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > The file content should only be read when invoking ParquetScanTask::Scan, not > on construction. This blocks reading in a true streaming fashion with memory > constraints. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-6964) [C++][Dataset] Expose a nested parallel option for Scanner::ToTable
[ https://issues.apache.org/jira/browse/ARROW-6964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques resolved ARROW-6964. --- Fix Version/s: 1.0.0 Resolution: Fixed Issue resolved by pull request 5721 [https://github.com/apache/arrow/pull/5721] > [C++][Dataset] Expose a nested parallel option for Scanner::ToTable > --- > > Key: ARROW-6964 > URL: https://issues.apache.org/jira/browse/ARROW-6964 > Project: Apache Arrow > Issue Type: Improvement > Components: C++, Dataset >Reporter: Francois Saint-Jacques >Assignee: Francois Saint-Jacques >Priority: Major > Labels: dataset, pull-request-available > Fix For: 1.0.0 > > Time Spent: 2.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6964) [C++][Dataset] Expose a nested parallel option for Scanner::ToTable
[ https://issues.apache.org/jira/browse/ARROW-6964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-6964: -- Component/s: C++ > [C++][Dataset] Expose a nested parallel option for Scanner::ToTable > --- > > Key: ARROW-6964 > URL: https://issues.apache.org/jira/browse/ARROW-6964 > Project: Apache Arrow > Issue Type: Improvement > Components: C++, Dataset >Reporter: Francois Saint-Jacques >Assignee: Francois Saint-Jacques >Priority: Major > Labels: dataset, pull-request-available > Time Spent: 2h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6966) [Go] 32bit memset is null
[ https://issues.apache.org/jira/browse/ARROW-6966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-6966: -- Component/s: Go > [Go] 32bit memset is null > - > > Key: ARROW-6966 > URL: https://issues.apache.org/jira/browse/ARROW-6966 > Project: Apache Arrow > Issue Type: Bug > Components: Go >Reporter: Jonathan A Sternberg >Assignee: Jonathan A Sternberg >Priority: Minor > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > If you use a function that calls `memset.Set`, the implementation on a 32 bit > machine seems to be unset. This happened in our 32 bit build here: > [https://circleci.com/gh/influxdata/influxdb/66112#tests/containers/2] > {code:java} > goroutine 66 [running]:goroutine 66 > [running]:testing.tRunner.func1(0x9e1f2c0) > /usr/local/go/src/testing/testing.go:830 +0x30epanic(0x899cb40, 0x9403c40) > /usr/local/go/src/runtime/panic.go:522 > +0x16egithub.com/influxdata/influxdb/vendor/github.com/apache/arrow/go/arrow/memory.Set(...) 
> > /root/go/src/github.com/influxdata/influxdb/vendor/github.com/apache/arrow/go/arrow/memory/memory.go:25github.com/influxdata/influxdb/vendor/github.com/apache/arrow/go/arrow/array.(*builder).init(0x9e44990, > 0x20) > /root/go/src/github.com/influxdata/influxdb/vendor/github.com/apache/arrow/go/arrow/array/builder.go:101 > > +0xc7github.com/influxdata/influxdb/vendor/github.com/apache/arrow/go/arrow/array.(*Int64Builder).init(0x9e44990, > 0x20) > /root/go/src/github.com/influxdata/influxdb/vendor/github.com/apache/arrow/go/arrow/array/numericbuilder.gen.go:102 > > +0x2fgithub.com/influxdata/influxdb/vendor/github.com/apache/arrow/go/arrow/array.(*Int64Builder).Resize(0x9e44990, > 0x2) > /root/go/src/github.com/influxdata/influxdb/vendor/github.com/apache/arrow/go/arrow/array/numericbuilder.gen.go:125 > > +0x42github.com/influxdata/influxdb/vendor/github.com/apache/arrow/go/arrow/array.(*builder).reserve(0x9e44990, > 0x1, 0x9c52464) > /root/go/src/github.com/influxdata/influxdb/vendor/github.com/apache/arrow/go/arrow/array/builder.go:138 > > +0x72github.com/influxdata/influxdb/vendor/github.com/apache/arrow/go/arrow/array.(*Int64Builder).Reserve(0x9e44990, > 0x1) > /root/go/src/github.com/influxdata/influxdb/vendor/github.com/apache/arrow/go/arrow/array/numericbuilder.gen.go:113 > > +0x51github.com/influxdata/influxdb/vendor/github.com/influxdata/flux/arrow.NewInt(0x9e4a770, > 0x1, 0x1, 0x0, 0x89f0360) > /root/go/src/github.com/influxdata/influxdb/vendor/github.com/influxdata/flux/arrow/int.go:10 > > +0x6cgithub.com/influxdata/influxdb/storage/reads.(*floatTable).advance(0x9e42070, > 0x0) > /root/go/src/github.com/influxdata/influxdb/storage/reads/table.gen.go:91 > +0x7egithub.com/influxdata/influxdb/storage/reads.newFloatTable(0x9e17740, > 0xe521a160, 0x9e1b8c0, 0x0, 0x0, 0x1e, 0x0, 0x8c13be0, 0x9e448a0, 0x9e448d0, > ...) 
> /root/go/src/github.com/influxdata/influxdb/storage/reads/table.gen.go:47 > +0x1c2github.com/influxdata/influxdb/storage/reads.(*filterIterator).handleRead(0x9e22840, > 0x9e0d1a0, 0x8c0ce00, 0x9e48780, 0x0, 0x0) > /root/go/src/github.com/influxdata/influxdb/storage/reads/reader.go:177 > +0x755github.com/influxdata/influxdb/storage/reads.(*filterIterator).Do(0x9e22840, > 0x9e0d170, 0x9c40070, 0x0) > /root/go/src/github.com/influxdata/influxdb/storage/reads/reader.go:140 > +0x138github.com/influxdata/influxdb/storage/reads_test.TestDuplicateKeys_ReadFilter(0x9e1f2c0) > /root/go/src/github.com/influxdata/influxdb/storage/reads/reader_test.go:89 > +0x1dftesting.tRunner(0x9e1f2c0, 0x8ad44e4) > /usr/local/go/src/testing/testing.go:865 +0x97created by testing.(*T).Run > /usr/local/go/src/testing/testing.go:916 +0x2b2 > {code} > I added a print statement at where memset happened to print the function that > was being used and got this: > {code} > [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 > 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0] 0 > {code} > If I set {{memset}} with a default, the code that calls into this works fine. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
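The failure mode described in ARROW-6966, a function variable that is only assigned on some architectures, can be sketched as a Python analogue. This is not the Go code from the Arrow repository; the table `SPECIALISED`, the architecture strings, and the function names are all hypothetical, and the sketch only shows why supplying a default avoids the crash.

```python
from typing import Callable, Dict, Optional

MemsetFn = Callable[[bytearray, int], None]


def set_bytes_generic(buf: bytearray, value: int) -> None:
    """Portable fallback: overwrite every byte of the buffer with `value`."""
    buf[:] = bytes([value]) * len(buf)


# Hypothetical per-architecture table: only one architecture has an entry,
# mirroring a build where the 32-bit case was never wired up.
SPECIALISED: Dict[str, MemsetFn] = {"amd64": set_bytes_generic}


def pick_memset(arch: str, default: Optional[MemsetFn]) -> Optional[MemsetFn]:
    """Look up the implementation for `arch`, falling back to `default`."""
    return SPECIALISED.get(arch, default)


def set_bytes(memset: Optional[MemsetFn], buf: bytearray, value: int) -> None:
    if memset is None:
        # Analogue of calling an unassigned function variable.
        raise RuntimeError("memset implementation was never assigned")
    memset(buf, value)


buf = bytearray(4)
try:
    set_bytes(pick_memset("386", default=None), buf, 1)
    failed_without_default = False
except RuntimeError:
    failed_without_default = True

set_bytes(pick_memset("386", default=set_bytes_generic), buf, 1)
```

With no default, the 32-bit lookup yields nothing and the call fails; with a generic default, the same call path fills the buffer as expected.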
[jira] [Updated] (ARROW-6956) [C++] Status should use unique_ptr
[ https://issues.apache.org/jira/browse/ARROW-6956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-6956: -- Component/s: C++ > [C++] Status should use unique_ptr > -- > > Key: ARROW-6956 > URL: https://issues.apache.org/jira/browse/ARROW-6956 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Francois Saint-Jacques >Priority: Minor > > The logic of Status::State is _very_ similar to unique_ptr except the deep > copy on copy. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6988) [CI][R] Buildbot's R Conda is failing
[ https://issues.apache.org/jira/browse/ARROW-6988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-6988: -- Component/s: R Continuous Integration > [CI][R] Buildbot's R Conda is failing > - > > Key: ARROW-6988 > URL: https://issues.apache.org/jira/browse/ARROW-6988 > Project: Apache Arrow > Issue Type: Improvement > Components: Continuous Integration, R >Reporter: Francois Saint-Jacques >Priority: Major > > {code:java} > Running ‘testthat.R’ > ERROR > Running the tests in ‘tests/testthat.R’ failed. > Last 13 lines of output: > 25: tryCatch(withCallingHandlers({eval(code, test_env)if (!handled > && !is.null(test)) {skip_empty()}}, expectation = > handle_expectation, skip = handle_skip, warning = handle_warning, message > = handle_message, error = handle_error), error = handle_fatal, skip = > function(e) {}) > 26: test_code(NULL, exprs, env) > 27: source_file(path, new.env(parent = env), chdir = TRUE, wrap = wrap) > 28: force(code) > 29: with_reporter(reporter = reporter, start_end_reporter = > start_end_reporter, {reporter$start_file(basename(path)) > lister$start_file(basename(path))source_file(path, new.env(parent = > env), chdir = TRUE, wrap = wrap)reporter$.end_context() > reporter$end_file()}) > 30: FUN(X[[i]], ...) 
> 31: lapply(paths, test_file, env = env, reporter = current_reporter, > start_end_reporter = FALSE, load_helpers = FALSE, wrap = wrap) > 32: force(code) > 33: with_reporter(reporter = current_reporter, results <- lapply(paths, > test_file, env = env, reporter = current_reporter, start_end_reporter = > FALSE, load_helpers = FALSE, wrap = wrap)) > 34: test_files(paths, reporter = reporter, env = env, stop_on_failure = > stop_on_failure, stop_on_warning = stop_on_warning, wrap = wrap) > 35: test_dir(path = test_path, reporter = reporter, env = env, filter = > filter, ..., stop_on_failure = stop_on_failure, stop_on_warning = > stop_on_warning, wrap = wrap) > 36: test_package_dir(package = package, test_path = test_path, filter = > filter, reporter = reporter, ..., stop_on_failure = stop_on_failure, > stop_on_warning = stop_on_warning, wrap = wrap) > 37: test_check("arrow") > An irrecoverable exception occurred. R is aborting now ... > Segmentation fault (core dumped) > * checking for unstated dependencies in vignettes ... OK > * checking package vignettes in ‘inst/doc’ ... OK > * checking re-building of vignette outputs ... OK > * DONE > Status: 1 ERROR, 1 WARNING, 2 NOTEs > See > ‘/buildbot/AMD64_Conda_R/r/arrow.Rcheck/00check.log’ > for details. > {code} > [|https://ci.ursalabs.org/#/builders/95] > [https://ci.ursalabs.org/#/builders/95/builds/2386] > [https://ci.ursalabs.org/#/builders/95] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-6988) [CI][R] Buildbot's R Conda is failing
Francois Saint-Jacques created ARROW-6988: - Summary: [CI][R] Buildbot's R Conda is failing Key: ARROW-6988 URL: https://issues.apache.org/jira/browse/ARROW-6988 Project: Apache Arrow Issue Type: Improvement Reporter: Francois Saint-Jacques {code:java} Running ‘testthat.R’ ERROR Running the tests in ‘tests/testthat.R’ failed. Last 13 lines of output: 25: tryCatch(withCallingHandlers({eval(code, test_env)if (!handled && !is.null(test)) {skip_empty()}}, expectation = handle_expectation, skip = handle_skip, warning = handle_warning, message = handle_message, error = handle_error), error = handle_fatal, skip = function(e) {}) 26: test_code(NULL, exprs, env) 27: source_file(path, new.env(parent = env), chdir = TRUE, wrap = wrap) 28: force(code) 29: with_reporter(reporter = reporter, start_end_reporter = start_end_reporter, {reporter$start_file(basename(path)) lister$start_file(basename(path))source_file(path, new.env(parent = env), chdir = TRUE, wrap = wrap)reporter$.end_context() reporter$end_file()}) 30: FUN(X[[i]], ...) 31: lapply(paths, test_file, env = env, reporter = current_reporter, start_end_reporter = FALSE, load_helpers = FALSE, wrap = wrap) 32: force(code) 33: with_reporter(reporter = current_reporter, results <- lapply(paths, test_file, env = env, reporter = current_reporter, start_end_reporter = FALSE, load_helpers = FALSE, wrap = wrap)) 34: test_files(paths, reporter = reporter, env = env, stop_on_failure = stop_on_failure, stop_on_warning = stop_on_warning, wrap = wrap) 35: test_dir(path = test_path, reporter = reporter, env = env, filter = filter, ..., stop_on_failure = stop_on_failure, stop_on_warning = stop_on_warning, wrap = wrap) 36: test_package_dir(package = package, test_path = test_path, filter = filter, reporter = reporter, ..., stop_on_failure = stop_on_failure, stop_on_warning = stop_on_warning, wrap = wrap) 37: test_check("arrow") An irrecoverable exception occurred. R is aborting now ... 
Segmentation fault (core dumped) * checking for unstated dependencies in vignettes ... OK * checking package vignettes in ‘inst/doc’ ... OK * checking re-building of vignette outputs ... OK * DONE Status: 1 ERROR, 1 WARNING, 2 NOTEs See ‘/buildbot/AMD64_Conda_R/r/arrow.Rcheck/00check.log’ for details. {code} [|https://ci.ursalabs.org/#/builders/95] [https://ci.ursalabs.org/#/builders/95/builds/2386] [https://ci.ursalabs.org/#/builders/95] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6987) [CI] Travis OSX failing to install sdk headers
[ https://issues.apache.org/jira/browse/ARROW-6987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6987: -- Labels: pull-request-available (was: ) > [CI] Travis OSX failing to install sdk headers > -- > > Key: ARROW-6987 > URL: https://issues.apache.org/jira/browse/ARROW-6987 > Project: Apache Arrow > Issue Type: Improvement > Components: Continuous Integration >Reporter: Francois Saint-Jacques >Priority: Major > Labels: pull-request-available > > {code:java} > sudo installer -pkg > /Library/Developer/CommandLineTools/Packages/macOS_SDK_headers_for_macOS_10.14.pkg > -target / > installer: Package name is > macOS_SDK_headers_for_macOS_10.14 > installer: Certificate used to sign > package is not trusted. Use -allowUntrusted to override. > The command > "$TRAVIS_BUILD_DIR/ci/travis_before_script_cpp.sh --only-library --homebrew" > failed and exited with 1 during . > {code} > See [https://travis-ci.org/apache/arrow/jobs/602434884#L342-L345] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-6987) [CI] Travis OSX failing to install sdk headers
Francois Saint-Jacques created ARROW-6987: - Summary: [CI] Travis OSX failing to install sdk headers Key: ARROW-6987 URL: https://issues.apache.org/jira/browse/ARROW-6987 Project: Apache Arrow Issue Type: Improvement Components: Continuous Integration Reporter: Francois Saint-Jacques {code:java} sudo installer -pkg /Library/Developer/CommandLineTools/Packages/macOS_SDK_headers_for_macOS_10.14.pkg -target / installer: Package name is macOS_SDK_headers_for_macOS_10.14 installer: Certificate used to sign package is not trusted. Use -allowUntrusted to override. The command "$TRAVIS_BUILD_DIR/ci/travis_before_script_cpp.sh --only-library --homebrew" failed and exited with 1 during . {code} See [https://travis-ci.org/apache/arrow/jobs/602434884#L342-L345] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6962) [C++] [CI] Stop compiling with -Weverything
[ https://issues.apache.org/jira/browse/ARROW-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-6962: --- Fix Version/s: 0.15.1 > [C++] [CI] Stop compiling with -Weverything > --- > > Key: ARROW-6962 > URL: https://issues.apache.org/jira/browse/ARROW-6962 > Project: Apache Arrow > Issue Type: Bug > Components: C++, Continuous Integration >Reporter: Antoine Pitrou >Assignee: Antoine Pitrou >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0, 0.15.1 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > We should simply use {{-Wall}} instead. > [https://quuxplusone.github.io/blog/2018/12/06/dont-use-weverything/] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6984) [C++] Update LZ4 to 1.9.2 for CVE-2019-17543
[ https://issues.apache.org/jira/browse/ARROW-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-6984: --- Fix Version/s: (was: 0.15.1) 1.0.0 > [C++] Update LZ4 to 1.9.2 for CVE-2019-17543 > > > Key: ARROW-6984 > URL: https://issues.apache.org/jira/browse/ARROW-6984 > Project: Apache Arrow > Issue Type: Wish > Components: C++ >Affects Versions: 0.15.0 >Reporter: Sangeeth Keeriyadath >Assignee: Krisztian Szucs >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > There is a reported CVE that LZ4 before 1.9.2 has a heap-based buffer > overflow in LZ4_write32 (More details in here - > [https://nvd.nist.gov/vuln/detail/CVE-2019-17543] ). I see that Apache Arrow > uses *v1.8.3* version ( > [https://github.com/apache/arrow/blob/47e5ecafa72b70112a64a1174b29b9db45f803ef/cpp/thirdparty/versions.txt#L38] > ). > We need to bump up the dependency version of LZ4 to *1.9.2* to get past the > reported CVE. Thank you! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6986) [R] Add basic Expression class
[ https://issues.apache.org/jira/browse/ARROW-6986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6986: -- Labels: pull-request-available (was: ) > [R] Add basic Expression class > -- > > Key: ARROW-6986 > URL: https://issues.apache.org/jira/browse/ARROW-6986 > Project: Apache Arrow > Issue Type: New Feature > Components: R >Reporter: Neal Richardson >Assignee: Neal Richardson >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > > I started this as part of ARROW-6980 but it proved not necessary. This will > be a foundation for ARROW-6982, in addition to being useful on its own. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-6986) [R] Add basic Expression class
Neal Richardson created ARROW-6986: -- Summary: [R] Add basic Expression class Key: ARROW-6986 URL: https://issues.apache.org/jira/browse/ARROW-6986 Project: Apache Arrow Issue Type: New Feature Components: R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0 I started this as part of ARROW-6980 but it proved not necessary. This will be a foundation for ARROW-6982, in addition to being useful on its own. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-6907) [C++][Plasma] Allow Plasma store to batch notifications to clients
[ https://issues.apache.org/jira/browse/ARROW-6907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philipp Moritz resolved ARROW-6907. --- Fix Version/s: 1.0.0 Resolution: Fixed Issue resolved by pull request 5626 [https://github.com/apache/arrow/pull/5626] > [C++][Plasma] Allow Plasma store to batch notifications to clients > -- > > Key: ARROW-6907 > URL: https://issues.apache.org/jira/browse/ARROW-6907 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ - Plasma >Reporter: Danyang >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6983) [C++] Threaded task group crashes sometimes
[ https://issues.apache.org/jira/browse/ARROW-6983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-6983: --- Fix Version/s: 0.15.1 > [C++] Threaded task group crashes sometimes > --- > > Key: ARROW-6983 > URL: https://issues.apache.org/jira/browse/ARROW-6983 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Reporter: Neal Richardson >Assignee: Antoine Pitrou >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0, 0.15.1 > > Time Spent: 2h > Remaining Estimate: 0h > > You can give this a more descriptive title :) > See discussion on ARROW-6977. > https://gist.github.com/pitrou/87f3091c226db3306c45b2c32dd9aea8 seems to fix > it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6963) [Packaging][Wheel][OSX] Use crossbow's command to deploy artifacts from travis builds
[ https://issues.apache.org/jira/browse/ARROW-6963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-6963: --- Fix Version/s: 0.15.1 > [Packaging][Wheel][OSX] Use crossbow's command to deploy artifacts from > travis builds > - > > Key: ARROW-6963 > URL: https://issues.apache.org/jira/browse/ARROW-6963 > Project: Apache Arrow > Issue Type: Improvement > Components: Packaging >Reporter: Krisztian Szucs >Assignee: Krisztian Szucs >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0, 0.15.1 > > Time Spent: 1.5h > Remaining Estimate: 0h > > Travis starts to fail more often during artefact deployment to GitHub > releases. > Crossbow has a builtin command to upload the artifacts which is more reliable. > All of the travis builds should use the crossbow script instead of relying on > travis's deployment feature. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-6963) [Packaging][Wheel][OSX] Use crossbow's command to deploy artifacts from travis builds
[ https://issues.apache.org/jira/browse/ARROW-6963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs resolved ARROW-6963. Fix Version/s: (was: 0.15.1) Resolution: Fixed Issue resolved by pull request 5726 [https://github.com/apache/arrow/pull/5726] > [Packaging][Wheel][OSX] Use crossbow's command to deploy artifacts from > travis builds > - > > Key: ARROW-6963 > URL: https://issues.apache.org/jira/browse/ARROW-6963 > Project: Apache Arrow > Issue Type: Improvement > Components: Packaging >Reporter: Krisztian Szucs >Assignee: Krisztian Szucs >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > Travis starts to fail more often during artefact deployment to GitHub > releases. > Crossbow has a builtin command to upload the artifacts which is more reliable. > All of the travis builds should use the crossbow script instead of relying on > travis's deployment feature. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6963) [Packaging][Wheel][OSX] Use crossbow's command to deploy artifacts from travis builds
[ https://issues.apache.org/jira/browse/ARROW-6963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-6963: --- Summary: [Packaging][Wheel][OSX] Use crossbow's command to deploy artifacts from travis builds (was: [Packaging] Use crossbow's command to deploy artifacts from travis builds) > [Packaging][Wheel][OSX] Use crossbow's command to deploy artifacts from > travis builds > - > > Key: ARROW-6963 > URL: https://issues.apache.org/jira/browse/ARROW-6963 > Project: Apache Arrow > Issue Type: Improvement > Components: Packaging >Reporter: Krisztian Szucs >Assignee: Krisztian Szucs >Priority: Major > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > Travis starts to fail more often during artefact deployment to GitHub > releases. > Crossbow has a builtin command to upload the artifacts which is more reliable. > All of the travis builds should use the crossbow script instead of relying on > travis's deployment feature. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-6963) [Packaging] Use crossbow's command to deploy artifacts from travis builds
[ https://issues.apache.org/jira/browse/ARROW-6963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs reassigned ARROW-6963: -- Assignee: Krisztian Szucs > [Packaging] Use crossbow's command to deploy artifacts from travis builds > - > > Key: ARROW-6963 > URL: https://issues.apache.org/jira/browse/ARROW-6963 > Project: Apache Arrow > Issue Type: Improvement > Components: Packaging >Reporter: Krisztian Szucs >Assignee: Krisztian Szucs >Priority: Major > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > Travis starts to fail more often during artefact deployment to GitHub > releases. > Crossbow has a builtin command to upload the artifacts which is more reliable. > All of the travis builds should use the crossbow script instead of relying on > travis's deployment feature. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6963) [Packaging][Wheel][OSX] Use crossbow's command to deploy artifacts from travis builds
[ https://issues.apache.org/jira/browse/ARROW-6963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-6963: --- Fix Version/s: 0.15.1 1.0.0 > [Packaging][Wheel][OSX] Use crossbow's command to deploy artifacts from > travis builds > - > > Key: ARROW-6963 > URL: https://issues.apache.org/jira/browse/ARROW-6963 > Project: Apache Arrow > Issue Type: Improvement > Components: Packaging >Reporter: Krisztian Szucs >Assignee: Krisztian Szucs >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0, 0.15.1 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > Travis starts to fail more often during artefact deployment to GitHub > releases. > Crossbow has a builtin command to upload the artifacts which is more reliable. > All of the travis builds should use the crossbow script instead of relying on > travis's deployment feature. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-6983) [C++] Threaded task group crashes sometimes
[ https://issues.apache.org/jira/browse/ARROW-6983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Kietzman resolved ARROW-6983. - Fix Version/s: (was: 0.15.1) Resolution: Fixed Issue resolved by pull request 5724 [https://github.com/apache/arrow/pull/5724] > [C++] Threaded task group crashes sometimes > --- > > Key: ARROW-6983 > URL: https://issues.apache.org/jira/browse/ARROW-6983 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Reporter: Neal Richardson >Assignee: Antoine Pitrou >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 2h > Remaining Estimate: 0h > > You can give this a more descriptive title :) > See discussion on ARROW-6977. > https://gist.github.com/pitrou/87f3091c226db3306c45b2c32dd9aea8 seems to fix > it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6977) [C++] Only enable jemalloc background_thread if feature is supported
[ https://issues.apache.org/jira/browse/ARROW-6977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-6977: --- Fix Version/s: 0.15.1 > [C++] Only enable jemalloc background_thread if feature is supported > > > Key: ARROW-6977 > URL: https://issues.apache.org/jira/browse/ARROW-6977 > Project: Apache Arrow > Issue Type: Bug > Components: C++ > Environment: macOS 10.14, Homebrew >Reporter: Neal Richardson >Assignee: Antoine Pitrou >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0, 0.15.1 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > Followup to ARROW-6910. When loading the R package after that patch merged, I > get this new message: > {code} > $ R > > library(arrow) > : option background_thread currently supports pthread only > {code} > https://github.com/jemalloc/jemalloc/blob/3d84bd57f4954a17059bd31330ec87d3c1876411/src/background_thread.c#L884-L887 > is where the message comes from. Tracing that further, > {{have_background_thread}} comes from > https://github.com/jemalloc/jemalloc/blob/21cfe59ff7b10a61dabe26cd3dbfb7a255e1f5e8/include/jemalloc/internal/jemalloc_preamble.h.in#L205-L211, > which gets set in {{configure.ac}} here: > https://github.com/jemalloc/jemalloc/blob/d2dddfb82aac9f2212922eb90324e84790704bfe/configure.ac#L2155-L2157 > In sum, on my system, that flag doesn't get set, so > {{have_background_thread}} is false, and when that is false and the > {{background_thread}} option is true, I get that message printed. And I do > not want to see that message. > cc [~wesm] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-6977) [C++] Only enable jemalloc background_thread if feature is supported
[ https://issues.apache.org/jira/browse/ARROW-6977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson resolved ARROW-6977. Fix Version/s: (was: 0.15.1) Resolution: Fixed Issue resolved by pull request 5729 [https://github.com/apache/arrow/pull/5729] > [C++] Only enable jemalloc background_thread if feature is supported > > > Key: ARROW-6977 > URL: https://issues.apache.org/jira/browse/ARROW-6977 > Project: Apache Arrow > Issue Type: Bug > Components: C++ > Environment: macOS 10.14, Homebrew >Reporter: Neal Richardson >Assignee: Antoine Pitrou >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > Followup to ARROW-6910. When loading the R package after that patch merged, I > get this new message: > {code} > $ R > > library(arrow) > : option background_thread currently supports pthread only > {code} > https://github.com/jemalloc/jemalloc/blob/3d84bd57f4954a17059bd31330ec87d3c1876411/src/background_thread.c#L884-L887 > is where the message comes from. Tracing that further, > {{have_background_thread}} comes from > https://github.com/jemalloc/jemalloc/blob/21cfe59ff7b10a61dabe26cd3dbfb7a255e1f5e8/include/jemalloc/internal/jemalloc_preamble.h.in#L205-L211, > which gets set in {{configure.ac}} here: > https://github.com/jemalloc/jemalloc/blob/d2dddfb82aac9f2212922eb90324e84790704bfe/configure.ac#L2155-L2157 > In sum, on my system, that flag doesn't get set, so > {{have_background_thread}} is false, and when that is false and the > {{background_thread}} option is true, I get that message printed. And I do > not want to see that message. > cc [~wesm] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6985) [Python] Steadily increasing time to load file using read_parquet
[ https://issues.apache.org/jira/browse/ARROW-6985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-6985: --- Component/s: Python > [Python] Steadily increasing time to load file using read_parquet > - > > Key: ARROW-6985 > URL: https://issues.apache.org/jira/browse/ARROW-6985 > Project: Apache Arrow > Issue Type: Bug > Components: Python >Affects Versions: 0.13.0, 0.14.0, 0.15.0 >Reporter: Casey >Priority: Minor > > I've noticed that reading from parquet using pandas read_parquet function is > taking steadily longer with each invocation. I've seen the other ticket about > memory usage but I'm seeing no memory impact just steadily increasing read > time until I restart the python session. > Below is some code to reproduce my results. I notice it's particularly bad on > wide matrices, especially using pyarrow==0.15.0 > {code:python} > import pyarrow.parquet as pq > import pyarrow as pa > import pandas as pd > import os > import numpy as np > import time > file = "skinny_matrix.pq" > if not os.path.isfile(file): > mat = np.zeros((6000, 26000)) > mat.ravel()[::100] = np.random.randn(60 * 26000) > df = pd.DataFrame(mat.T) > table = pa.Table.from_pandas(df) > pq.write_table(table, file) > n_timings = 50 > timings = np.empty(n_timings) > for i in range(n_timings): > start = time.time() > new_df = pd.read_parquet(file) > end = time.time() > timings[i] = end - start > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6985) [Python] Steadily increasing time to load file using read_parquet
[ https://issues.apache.org/jira/browse/ARROW-6985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-6985: --- Summary: [Python] Steadily increasing time to load file using read_parquet (was: Steadily increasing time to load file using read_parquet) > [Python] Steadily increasing time to load file using read_parquet > - > > Key: ARROW-6985 > URL: https://issues.apache.org/jira/browse/ARROW-6985 > Project: Apache Arrow > Issue Type: Bug >Affects Versions: 0.13.0, 0.14.0, 0.15.0 >Reporter: Casey >Priority: Minor > > I've noticed that reading from parquet using pandas read_parquet function is > taking steadily longer with each invocation. I've seen the other ticket about > memory usage but I'm seeing no memory impact just steadily increasing read > time until I restart the python session. > Below is some code to reproduce my results. I notice it's particularly bad on > wide matrices, especially using pyarrow==0.15.0 > {code:python} > import pyarrow.parquet as pq > import pyarrow as pa > import pandas as pd > import os > import numpy as np > import time > file = "skinny_matrix.pq" > if not os.path.isfile(file): > mat = np.zeros((6000, 26000)) > mat.ravel()[::100] = np.random.randn(60 * 26000) > df = pd.DataFrame(mat.T) > table = pa.Table.from_pandas(df) > pq.write_table(table, file) > n_timings = 50 > timings = np.empty(n_timings) > for i in range(n_timings): > start = time.time() > new_df = pd.read_parquet(file) > end = time.time() > timings[i] = end - start > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6985) Steadily increasing time to load file using read_parquet
[ https://issues.apache.org/jira/browse/ARROW-6985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-6985: --- Fix Version/s: (was: 0.15.0) (was: 0.14.0) (was: 0.13.0) > Steadily increasing time to load file using read_parquet > > > Key: ARROW-6985 > URL: https://issues.apache.org/jira/browse/ARROW-6985 > Project: Apache Arrow > Issue Type: Bug >Affects Versions: 0.13.0, 0.14.0, 0.15.0 >Reporter: Casey >Priority: Minor > > I've noticed that reading from parquet using pandas read_parquet function is > taking steadily longer with each invocation. I've seen the other ticket about > memory usage but I'm seeing no memory impact just steadily increasing read > time until I restart the python session. > Below is some code to reproduce my results. I notice it's particularly bad on > wide matrices, especially using pyarrow==0.15.0 > {code:python} > import pyarrow.parquet as pq > import pyarrow as pa > import pandas as pd > import os > import numpy as np > import time > file = "skinny_matrix.pq" > if not os.path.isfile(file): > mat = np.zeros((6000, 26000)) > mat.ravel()[::100] = np.random.randn(60 * 26000) > df = pd.DataFrame(mat.T) > table = pa.Table.from_pandas(df) > pq.write_table(table, file) > n_timings = 50 > timings = np.empty(n_timings) > for i in range(n_timings): > start = time.time() > new_df = pd.read_parquet(file) > end = time.time() > timings[i] = end - start > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6340) [R] Implements low-level bindings to Dataset classes
[ https://issues.apache.org/jira/browse/ARROW-6340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-6340: --- Description: The following classes should be accessible from R: * class DataSource * class DataSourceDiscovery * class Dataset * class ScanContext, ScanOptions, ScanTask * class ScannerBuilder * class Scanner The end result is reading a directory of parquet files as a single stream. One should be able to re-implement [https://github.com/apache/arrow/pull/5720] in R. was: The following classes should be accessible from R: * class DataSource * class DataFragment * function DiscoverySource * class ScanContext, ScanOptions, ScanTask * class Dataset * class ScannerBuilder * class Scanner The end result is reading a directory of parquet files as a single stream > [R] Implements low-level bindings to Dataset classes > > > Key: ARROW-6340 > URL: https://issues.apache.org/jira/browse/ARROW-6340 > Project: Apache Arrow > Issue Type: New Feature > Components: R >Reporter: Francois Saint-Jacques >Assignee: Romain Francois >Priority: Major > Labels: dataset, pull-request-available > Fix For: 1.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > The following classes should be accessible from R: > * class DataSource > * class DataSourceDiscovery > * class Dataset > * class ScanContext, ScanOptions, ScanTask > * class ScannerBuilder > * class Scanner > The end result is reading a directory of parquet files as a single stream. > One should be able to re-implement > [https://github.com/apache/arrow/pull/5720] in R. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-6984) [C++] Update LZ4 to 1.9.2 for CVE-2019-17543
[ https://issues.apache.org/jira/browse/ARROW-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs reassigned ARROW-6984: -- Assignee: Krisztian Szucs > [C++] Update LZ4 to 1.9.2 for CVE-2019-17543 > > > Key: ARROW-6984 > URL: https://issues.apache.org/jira/browse/ARROW-6984 > Project: Apache Arrow > Issue Type: Wish > Components: C++ >Affects Versions: 0.15.0 >Reporter: Sangeeth Keeriyadath >Assignee: Krisztian Szucs >Priority: Major > Labels: pull-request-available > Fix For: 0.15.1 > > Time Spent: 20m > Remaining Estimate: 0h > > There is a reported CVE that LZ4 before 1.9.2 has a heap-based buffer > overflow in LZ4_write32 (More details in here - > [https://nvd.nist.gov/vuln/detail/CVE-2019-17543] ). I see that Apache Arrow > uses *v1.8.3* version ( > [https://github.com/apache/arrow/blob/47e5ecafa72b70112a64a1174b29b9db45f803ef/cpp/thirdparty/versions.txt#L38] > ). > We need to bump up the dependency version of LZ4 to *1.9.2* to get past the > reported CVE. Thank you! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-6977) [C++] Only enable jemalloc background_thread if feature is supported
[ https://issues.apache.org/jira/browse/ARROW-6977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou reassigned ARROW-6977: - Assignee: Antoine Pitrou > [C++] Only enable jemalloc background_thread if feature is supported > > > Key: ARROW-6977 > URL: https://issues.apache.org/jira/browse/ARROW-6977 > Project: Apache Arrow > Issue Type: Bug > Components: C++ > Environment: macOS 10.14, Homebrew >Reporter: Neal Richardson >Assignee: Antoine Pitrou >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0, 0.15.1 > > Time Spent: 20m > Remaining Estimate: 0h > > Followup to ARROW-6910. When loading the R package after that patch merged, I > get this new message: > {code} > $ R > > library(arrow) > : option background_thread currently supports pthread only > {code} > https://github.com/jemalloc/jemalloc/blob/3d84bd57f4954a17059bd31330ec87d3c1876411/src/background_thread.c#L884-L887 > is where the message comes from. Tracing that further, > {{have_background_thread}} comes from > https://github.com/jemalloc/jemalloc/blob/21cfe59ff7b10a61dabe26cd3dbfb7a255e1f5e8/include/jemalloc/internal/jemalloc_preamble.h.in#L205-L211, > which gets set in {{configure.ac}} here: > https://github.com/jemalloc/jemalloc/blob/d2dddfb82aac9f2212922eb90324e84790704bfe/configure.ac#L2155-L2157 > In sum, on my system, that flag doesn't get set, so > {{have_background_thread}} is false, and when that is false and the > {{background_thread}} option is true, I get that message printed. And I do > not want to see that message. > cc [~wesm] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6983) [C++] Threaded task group crashes sometimes
[ https://issues.apache.org/jira/browse/ARROW-6983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou updated ARROW-6983: -- Fix Version/s: 1.0.0 > [C++] Threaded task group crashes sometimes > --- > > Key: ARROW-6983 > URL: https://issues.apache.org/jira/browse/ARROW-6983 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Reporter: Neal Richardson >Assignee: Antoine Pitrou >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0, 0.15.1 > > Time Spent: 1.5h > Remaining Estimate: 0h > > You can give this a more descriptive title :) > See discussion on ARROW-6977. > https://gist.github.com/pitrou/87f3091c226db3306c45b2c32dd9aea8 seems to fix > it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6977) [C++] Only enable jemalloc background_thread if feature is supported
[ https://issues.apache.org/jira/browse/ARROW-6977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6977: -- Labels: pull-request-available (was: ) > [C++] Only enable jemalloc background_thread if feature is supported > > > Key: ARROW-6977 > URL: https://issues.apache.org/jira/browse/ARROW-6977 > Project: Apache Arrow > Issue Type: Bug > Components: C++ > Environment: macOS 10.14, Homebrew >Reporter: Neal Richardson >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0, 0.15.1 > > > Followup to ARROW-6910. When loading the R package after that patch merged, I > get this new message: > {code} > $ R > > library(arrow) > : option background_thread currently supports pthread only > {code} > https://github.com/jemalloc/jemalloc/blob/3d84bd57f4954a17059bd31330ec87d3c1876411/src/background_thread.c#L884-L887 > is where the message comes from. Tracing that further, > {{have_background_thread}} comes from > https://github.com/jemalloc/jemalloc/blob/21cfe59ff7b10a61dabe26cd3dbfb7a255e1f5e8/include/jemalloc/internal/jemalloc_preamble.h.in#L205-L211, > which gets set in {{configure.ac}} here: > https://github.com/jemalloc/jemalloc/blob/d2dddfb82aac9f2212922eb90324e84790704bfe/configure.ac#L2155-L2157 > In sum, on my system, that flag doesn't get set, so > {{have_background_thread}} is false, and when that is false and the > {{background_thread}} option is true, I get that message printed. And I do > not want to see that message. > cc [~wesm] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-6961) [C++][Gandiva] Add lower_utf8 function in Gandiva
[ https://issues.apache.org/jira/browse/ARROW-6961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou resolved ARROW-6961. --- Fix Version/s: 1.0.0 Resolution: Fixed Issue resolved by pull request 5712 [https://github.com/apache/arrow/pull/5712] > [C++][Gandiva] Add lower_utf8 function in Gandiva > - > > Key: ARROW-6961 > URL: https://issues.apache.org/jira/browse/ARROW-6961 > Project: Apache Arrow > Issue Type: Task > Components: C++ - Gandiva >Reporter: Prudhvi Porandla >Assignee: Prudhvi Porandla >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 2h 40m > Remaining Estimate: 0h > > Function signature is {{utf8 lower(utf8)}}. Converts a utf8 sequence to > lower case. > This handles only ASCII characters. -- This message was sent by Atlassian Jira (v8.3.4#803005)
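As an illustration of what "handles only ASCII characters" can mean for a utf8 input, here is a minimal Python sketch. It is not the Gandiva implementation; the function name `lower_ascii_only` is hypothetical. It relies on the fact that every byte of a multi-byte UTF-8 sequence is 0x80 or above, so rewriting only the bytes 0x41-0x5A ('A'-'Z') cannot corrupt non-ASCII characters.

```python
def lower_ascii_only(data: bytes) -> bytes:
    """Lower-case ASCII letters in a UTF-8 byte string.

    Hypothetical helper for illustration only. Bytes outside 'A'..'Z'
    (including all bytes of multi-byte UTF-8 sequences) pass through unchanged.
    """
    return bytes(b + 32 if 0x41 <= b <= 0x5A else b for b in data)


# Non-ASCII upper-case letters such as "À" are deliberately left alone.
sample = "ÀBC Déf".encode("utf-8")
lowered = lower_ascii_only(sample).decode("utf-8")
```

With this approach the result is still valid UTF-8, but letters outside the ASCII range keep their original case.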
[jira] [Updated] (ARROW-6984) [C++] Update LZ4 to 1.9.2 for CVE-2019-17543
[ https://issues.apache.org/jira/browse/ARROW-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6984: -- Labels: pull-request-available (was: ) > [C++] Update LZ4 to 1.9.2 for CVE-2019-17543 > > > Key: ARROW-6984 > URL: https://issues.apache.org/jira/browse/ARROW-6984 > Project: Apache Arrow > Issue Type: Wish > Components: C++ >Affects Versions: 0.15.0 >Reporter: Sangeeth Keeriyadath >Priority: Major > Labels: pull-request-available > Fix For: 0.15.1 > > > There is a reported CVE that LZ4 before 1.9.2 has a heap-based buffer > overflow in LZ4_write32 (More details in here - > [https://nvd.nist.gov/vuln/detail/CVE-2019-17543] ). I see that Apache Arrow > uses *v1.8.3* version ( > [https://github.com/apache/arrow/blob/47e5ecafa72b70112a64a1174b29b9db45f803ef/cpp/thirdparty/versions.txt#L38] > ). > We need to bump up the dependency version of LZ4 to *1.9.2* to get past the > reported CVE. Thank you! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6984) [C++] Update LZ4 to 1.9.2 for CVE-2019-17543
[ https://issues.apache.org/jira/browse/ARROW-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-6984: --- Summary: [C++] Update LZ4 to 1.9.2 for CVE-2019-17543 (was: Update LZ4 to 1.9.2 for CVE-2019-17543) > [C++] Update LZ4 to 1.9.2 for CVE-2019-17543 > > > Key: ARROW-6984 > URL: https://issues.apache.org/jira/browse/ARROW-6984 > Project: Apache Arrow > Issue Type: Wish > Components: C++ >Affects Versions: 0.15.0 >Reporter: Sangeeth Keeriyadath >Priority: Major > Fix For: 0.15.1 > > > There is a reported CVE that LZ4 before 1.9.2 has a heap-based buffer > overflow in LZ4_write32 (More details in here - > [https://nvd.nist.gov/vuln/detail/CVE-2019-17543] ). I see that Apache Arrow > uses *v1.8.3* version ( > [https://github.com/apache/arrow/blob/47e5ecafa72b70112a64a1174b29b9db45f803ef/cpp/thirdparty/versions.txt#L38] > ). > We need to bump up the dependency version of LZ4 to *1.9.2* to get past the > reported CVE. Thank you! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6825) [C++] Rework CSV reader IO around readahead iterator
[ https://issues.apache.org/jira/browse/ARROW-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6825: -- Labels: pull-request-available (was: ) > [C++] Rework CSV reader IO around readahead iterator > > > Key: ARROW-6825 > URL: https://issues.apache.org/jira/browse/ARROW-6825 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Antoine Pitrou >Assignee: Antoine Pitrou >Priority: Major > Labels: pull-request-available > > Following ARROW-6764, we should try to remove the custom ReadaheadSpooler and > use the generic readahead iteration facility instead. This will require > reworking the blocking / chunking logic to mimic what is done in the JSON > reader. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-6952) [C++][Dataset] Ensure expression filter is passed ParquetDataFragment
[ https://issues.apache.org/jira/browse/ARROW-6952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Kietzman reassigned ARROW-6952: --- Assignee: Ben Kietzman > [C++][Dataset] Ensure expression filter is passed ParquetDataFragment > - > > Key: ARROW-6952 > URL: https://issues.apache.org/jira/browse/ARROW-6952 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Francois Saint-Jacques >Assignee: Ben Kietzman >Priority: Major > Labels: dataset > > We should be able to prune RowGroups based on the expression and the > statistics. -- This message was sent by Atlassian Jira (v8.3.4#803005)
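The idea of pruning RowGroups from an expression and the statistics can be sketched roughly as follows. This is a simplified Python illustration, not the Arrow C++ Dataset API: `RowGroupStats`, `may_contain_greater_than` and `prune_row_groups` are hypothetical names, and the only filter considered is `column > threshold` on a single integer column with known min/max statistics.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RowGroupStats:
    """Hypothetical per-row-group min/max statistics for one column."""
    index: int
    min_value: int
    max_value: int


def may_contain_greater_than(stats: RowGroupStats, threshold: int) -> bool:
    # If even the maximum is not above the threshold, no row can match.
    return stats.max_value > threshold


def prune_row_groups(row_groups: List[RowGroupStats], threshold: int) -> List[int]:
    """Return the indices of row groups that still need to be read."""
    return [rg.index for rg in row_groups if may_contain_greater_than(rg, threshold)]


groups = [
    RowGroupStats(index=0, min_value=0, max_value=9),
    RowGroupStats(index=1, min_value=10, max_value=19),
    RowGroupStats(index=2, min_value=20, max_value=29),
]
to_read = prune_row_groups(groups, threshold=15)
```

Statistics can only rule a row group out. A surviving row group (such as index 1 here) still has to be filtered row by row.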
[jira] [Closed] (ARROW-6299) [C++] Simplify FileFormat classes to singletons
[ https://issues.apache.org/jira/browse/ARROW-6299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Kietzman closed ARROW-6299. --- Resolution: Won't Do > [C++] Simplify FileFormat classes to singletons > --- > > Key: ARROW-6299 > URL: https://issues.apache.org/jira/browse/ARROW-6299 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Ben Kietzman >Assignee: Ben Kietzman >Priority: Minor > Labels: dataset > Fix For: 1.0.0 > > > ParquetFileFormat has no state, so passing it around by > shared_ptr is not necessary; we could just keep a single static > instance and pass raw pointers. > [~wesmckinn] is there a case where a FileFormat might have state? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-6299) [C++] Simplify FileFormat classes to singletons
[ https://issues.apache.org/jira/browse/ARROW-6299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16958951#comment-16958951 ] Ben Kietzman commented on ARROW-6299: - In the case of CSV, it's most natural to consider files with a comma separator and files with a tab separator as different formats. > [C++] Simplify FileFormat classes to singletons > --- > > Key: ARROW-6299 > URL: https://issues.apache.org/jira/browse/ARROW-6299 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Ben Kietzman >Assignee: Ben Kietzman >Priority: Minor > Labels: dataset > Fix For: 1.0.0 > > > ParquetFileFormat has no state, so passing it around by > shared_ptr is not necessary; we could just keep a single static > instance and pass raw pointers. > [~wesmckinn] is there a case where a FileFormat might have state? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6964) [C++][Dataset] Expose a nested parallel option for Scanner::ToTable
[ https://issues.apache.org/jira/browse/ARROW-6964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Kietzman updated ARROW-6964: Summary: [C++][Dataset] Expose a nested parallel option for Scanner::ToTable (was: [C++][Dataset] Expose a nested parellel option for Scanner::ToTable) > [C++][Dataset] Expose a nested parallel option for Scanner::ToTable > --- > > Key: ARROW-6964 > URL: https://issues.apache.org/jira/browse/ARROW-6964 > Project: Apache Arrow > Issue Type: Improvement >Reporter: Francois Saint-Jacques >Assignee: Francois Saint-Jacques >Priority: Major > Labels: dataset, pull-request-available > Time Spent: 2h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-6985) Steadily increasing time to load file using read_parquet
Casey created ARROW-6985:

Summary: Steadily increasing time to load file using read_parquet
Key: ARROW-6985
URL: https://issues.apache.org/jira/browse/ARROW-6985
Project: Apache Arrow
Issue Type: Bug
Affects Versions: 0.15.0, 0.14.0, 0.13.0
Reporter: Casey
Fix For: 0.15.0, 0.14.0, 0.13.0

I've noticed that reading from parquet using pandas read_parquet function is taking steadily longer with each invocation. I've seen the other ticket about memory usage but I'm seeing no memory impact just steadily increasing read time until I restart the python session.

Below is some code to reproduce my results. I notice it's particularly bad on wide matrices, especially using pyarrow==0.15.0

{code:python}
import pyarrow.parquet as pq
import pyarrow as pa
import pandas as pd
import os
import numpy as np
import time

file = "skinny_matrix.pq"
if not os.path.isfile(file):
    mat = np.zeros((6000, 26000))
    mat.ravel()[::100] = np.random.randn(60 * 26000)
    df = pd.DataFrame(mat.T)
    table = pa.Table.from_pandas(df)
    pq.write_table(table, file)

n_timings = 50
timings = np.empty(n_timings)
for i in range(n_timings):
    start = time.time()
    new_df = pd.read_parquet(file)
    end = time.time()
    timings[i] = end - start
{code}

-- This message was sent by Atlassian Jira (v8.3.4#803005)
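One possible way to quantify "steadily increasing" from a list of timings like the one collected in the report is a least-squares slope over the invocation index. The sketch below is an assumption about how one might analyse the numbers, not part of the original report. The helper name `linear_trend` is hypothetical, and the sample values are made up purely to show the calculation.

```python
from typing import Sequence


def linear_trend(timings: Sequence[float]) -> float:
    """Least-squares slope of timings against invocation index (seconds per call)."""
    n = len(timings)
    if n < 2:
        raise ValueError("need at least two timings to estimate a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(timings) / n
    cov = sum((i - mean_x) * (t - mean_y) for i, t in enumerate(timings))
    var = sum((i - mean_x) ** 2 for i in range(n))
    return cov / var


# Made-up numbers for illustration only, not measurements from the report.
growing = linear_trend([1.0, 1.5, 2.0, 2.5])
flat = linear_trend([2.0, 2.0, 2.0])
```

A slope clearly above zero across many invocations would support the observation that each read takes longer than the last. A slope near zero would suggest ordinary noise.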
[jira] [Updated] (ARROW-6963) [Packaging] Use crossbow's command to deploy artifacts from travis builds
[ https://issues.apache.org/jira/browse/ARROW-6963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6963: -- Labels: pull-request-available (was: ) > [Packaging] Use crossbow's command to deploy artifacts from travis builds > - > > Key: ARROW-6963 > URL: https://issues.apache.org/jira/browse/ARROW-6963 > Project: Apache Arrow > Issue Type: Improvement > Components: Packaging >Reporter: Krisztian Szucs >Priority: Major > Labels: pull-request-available > > Travis starts to fail more often during artefact deployment to GitHub > releases. > Crossbow has a builtin command to upload the artifacts which is more reliable. > All of the travis builds should use the crossbow script instead of relying on > travis's deployment feature. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-6704) [C++] Cast from timestamp to higher resolution does not check out of bounds timestamps
[ https://issues.apache.org/jira/browse/ARROW-6704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antoine Pitrou resolved ARROW-6704.
---
Fix Version/s: 1.0.0
Resolution: Fixed

Issue resolved by pull request 5623 [https://github.com/apache/arrow/pull/5623]

> [C++] Cast from timestamp to higher resolution does not check out of bounds timestamps
> --
>
> Key: ARROW-6704
> URL: https://issues.apache.org/jira/browse/ARROW-6704
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++
> Reporter: Joris Van den Bossche
> Assignee: Joris Van den Bossche
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.0.0
>
> Time Spent: 3h 20m
> Remaining Estimate: 0h
>
> When casting e.g. {{timestamp('s')}} to {{timestamp('ns')}}, we do not check
> for out of bounds timestamps, giving "garbage" timestamps in the result:
> {code}
> In [74]: a_np = np.array(["2012-01-01", "2412-01-01"], dtype="datetime64[s]")
>
> In [75]: arr = pa.array(a_np)
>
> In [76]: arr
> Out[76]:
> [
>   2012-01-01 00:00:00,
>   2412-01-01 00:00:00
> ]
>
> In [77]: arr.cast(pa.timestamp('ns'))
> Out[77]:
> [
>   2012-01-01 00:00:00.0,
>   1827-06-13 00:25:26.290448384
> ]
> {code}
> Now, this is the same behaviour as numpy, so not sure we should do this.
> However, since we have a {{safe=True/False}}, I would expect that for
> {{safe=True}} we check this and for {{safe=False}} we do not check this.
> (numpy has a similar {{casting='safe'}} but also does not raise an error in
> that case).

-- This message was sent by Atlassian Jira (v8.3.4#803005)
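The "garbage" value in the report is consistent with 64-bit integer wraparound when seconds are multiplied by 10^9. The following Python sketch reproduces the arithmetic with plain integers. It does not use pyarrow or numpy, and the helper names `seconds_to_ns_checked` and `wrap_int64` are hypothetical. The first helper shows the kind of bounds check one might expect for a safe cast; the second emulates the unchecked behaviour.

```python
from datetime import datetime, timezone

INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1
NS_PER_S = 1_000_000_000


def seconds_to_ns_checked(seconds: int) -> int:
    """Scale seconds to nanoseconds, raising if the result leaves the int64 range."""
    ns = seconds * NS_PER_S  # Python ints are unbounded, so this cannot wrap
    if not INT64_MIN <= ns <= INT64_MAX:
        raise OverflowError(f"{seconds} s does not fit in int64 nanoseconds")
    return ns


def wrap_int64(value: int) -> int:
    """Emulate two's-complement int64 wraparound (the unchecked behaviour)."""
    return (value + 2**63) % 2**64 - 2**63


ok_s = int(datetime(2012, 1, 1, tzinfo=timezone.utc).timestamp())
bad_s = int(datetime(2412, 1, 1, tzinfo=timezone.utc).timestamp())

ok_ns = seconds_to_ns_checked(ok_s)          # fits comfortably in int64
wrapped_ns = wrap_int64(bad_s * NS_PER_S)    # negative: lands before 1970
```

The wrapped value is negative with a fractional part of 290448384 ns, which matches the `.290448384` suffix of the 1827 timestamp shown in the report. Calling `seconds_to_ns_checked(bad_s)` raises `OverflowError` instead.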
[jira] [Updated] (ARROW-6341) [Python] Implement low-level bindings for Dataset
[ https://issues.apache.org/jira/browse/ARROW-6341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-6341: -- Description: The following classes should be accessible from Python: * class DataSource * class DataSourceDiscovery * class Dataset * class ScanContext, ScanOptions, ScanTask * class ScannerBuilder * class Scanner The end result is reading a directory of parquet files as a single stream. One should be able to re-implement [https://github.com/apache/arrow/pull/5720] in python. was: The following classes should be accessible from Python: * class DataSource * class DataFragment * function DiscoverySource * class ScanContext, ScanOptions, ScanTask * class Dataset * class ScannerBuilder * class Scanner The end result is reading a directory of parquet files as a single stream. > [Python] Implement low-level bindings for Dataset > - > > Key: ARROW-6341 > URL: https://issues.apache.org/jira/browse/ARROW-6341 > Project: Apache Arrow > Issue Type: New Feature > Components: Python >Reporter: Francois Saint-Jacques >Assignee: Krisztian Szucs >Priority: Major > Labels: dataset, pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > The following classes should be accessible from Python: > * class DataSource > * class DataSourceDiscovery > * class Dataset > * class ScanContext, ScanOptions, ScanTask > * class ScannerBuilder > * class Scanner > The end result is reading a directory of parquet files as a single stream. > One should be able to re-implement > [https://github.com/apache/arrow/pull/5720] in python. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-6984) Update LZ4 to 1.9.2 for CVE-2019-17543
Sangeeth Keeriyadath created ARROW-6984: --- Summary: Update LZ4 to 1.9.2 for CVE-2019-17543 Key: ARROW-6984 URL: https://issues.apache.org/jira/browse/ARROW-6984 Project: Apache Arrow Issue Type: Wish Components: C++ Affects Versions: 0.15.0 Reporter: Sangeeth Keeriyadath Fix For: 0.15.1 There is a reported CVE that LZ4 before 1.9.2 has a heap-based buffer overflow in LZ4_write32 (More details in here - [https://nvd.nist.gov/vuln/detail/CVE-2019-17543] ). I see that Apache Arrow uses *v1.8.3* version ( [https://github.com/apache/arrow/blob/47e5ecafa72b70112a64a1174b29b9db45f803ef/cpp/thirdparty/versions.txt#L38] ). We need to bump up the dependency version of LZ4 to *1.9.2* to get past the reported CVE. Thank you! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-6970) [Packaging][RPM] Add support for CentOS 8
[ https://issues.apache.org/jira/browse/ARROW-6970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques resolved ARROW-6970. --- Fix Version/s: 1.0.0 Resolution: Fixed Issue resolved by pull request 5715 [https://github.com/apache/arrow/pull/5715] > [Packaging][RPM] Add support for CentOS 8 > - > > Key: ARROW-6970 > URL: https://issues.apache.org/jira/browse/ARROW-6970 > Project: Apache Arrow > Issue Type: Improvement > Components: Packaging >Reporter: Kouhei Sutou >Assignee: Kouhei Sutou >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 4h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-6948) [Rust] [Parquet] Fix bool array support in arrow reader.
[ https://issues.apache.org/jira/browse/ARROW-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Krisztian Szucs resolved ARROW-6948.
------------------------------------
Resolution: Fixed

Issue resolved by pull request 5705
[https://github.com/apache/arrow/pull/5705]

> [Rust] [Parquet] Fix bool array support in arrow reader.
> --------------------------------------------------------
>
> Key: ARROW-6948
> URL: https://issues.apache.org/jira/browse/ARROW-6948
> Project: Apache Arrow
> Issue Type: Bug
> Components: Rust
> Reporter: Renjie Liu
> Assignee: Renjie Liu
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.0.0
> Time Spent: 1.5h
> Remaining Estimate: 0h
[jira] [Updated] (ARROW-6969) [C++][Dataset] ParquetScanTask eagerly load file
[ https://issues.apache.org/jira/browse/ARROW-6969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ARROW-6969:
----------------------------------
Labels: dataset pull-request-available (was: dataset)

> [C++][Dataset] ParquetScanTask eagerly load file
> ------------------------------------------------
>
> Key: ARROW-6969
> URL: https://issues.apache.org/jira/browse/ARROW-6969
> Project: Apache Arrow
> Issue Type: Improvement
> Reporter: Francois Saint-Jacques
> Assignee: Francois Saint-Jacques
> Priority: Major
> Labels: dataset, pull-request-available
>
> The file content should only be read when invoking ParquetScanTask::Scan, not
> on construction. This blocks reading in a true streaming fashion with memory
> constraints.
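The behaviour requested in ARROW-6969 (read only when scanning, not on construction) can be illustrated with a minimal Python sketch. This is not Arrow's C++ implementation and does not read Parquet; `LazyScanTask`, `chunk_size`, `bytes_read`, and `demo` are hypothetical names chosen for illustration, and the file is treated as raw bytes.

```python
import os
import tempfile
from typing import Iterator, List


class LazyScanTask:
    """Hypothetical scan task that defers all file I/O until scan() is iterated."""

    def __init__(self, path: str, chunk_size: int = 4) -> None:
        # Construction only records where the data lives; nothing is opened or read.
        self.path = path
        self.chunk_size = chunk_size
        self.bytes_read = 0

    def scan(self) -> Iterator[bytes]:
        # The file is opened on first iteration and read one chunk at a time,
        # so memory use is bounded by chunk_size rather than by the file size.
        with open(self.path, "rb") as f:
            while True:
                chunk = f.read(self.chunk_size)
                if not chunk:
                    break
                self.bytes_read += len(chunk)
                yield chunk


def demo() -> List[object]:
    # Write a small stand-in file, build the task, then scan it.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "part-0.bin")
        with open(path, "wb") as f:
            f.write(b"abcdefghij")
        task = LazyScanTask(path, chunk_size=4)
        read_before_scan = task.bytes_read  # still 0: the constructor did no I/O
        chunks = list(task.scan())
        return [read_before_scan, chunks, task.bytes_read]
```

Because `scan()` is a generator, a consumer that stops early never pays for the unread remainder of the file, which is the property an eager constructor would defeat.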
[jira] [Commented] (ARROW-6925) [C++] Arrow fails to buld on MacOS 10.13.6 using brew gcc 7 and 8
[ https://issues.apache.org/jira/browse/ARROW-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16958862#comment-16958862 ]

Francois Saint-Jacques commented on ARROW-6925:
-----------------------------------------------

Noted, I found out how to do it from now on.

> [C++] Arrow fails to buld on MacOS 10.13.6 using brew gcc 7 and 8
> -----------------------------------------------------------------
>
> Key: ARROW-6925
> URL: https://issues.apache.org/jira/browse/ARROW-6925
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++
> Environment: MacOS 10.13.6 using both brew gcc 7 and 8.
> Reporter: John Norris
> Assignee: John Norris
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.0.0
> Time Spent: 40m
> Remaining Estimate: 0h
>
> Both SetupCxxFlags.cmake and ThirdpartyToolchain.cmake add -stdlib=libc++ to
> the compiler flags when APPLE is true, but if you're using GCC from brew (or
> presumably from anywhere other than Apple), this flag is not recognized and
> the build fails.
[jira] [Resolved] (ARROW-6949) [Java] Fix promotable write to handle nullvectors
[ https://issues.apache.org/jira/browse/ARROW-6949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Praveen Kumar resolved ARROW-6949.
----------------------------------
Fix Version/s: 1.0.0
Resolution: Fixed

Issue resolved by pull request 5698
[https://github.com/apache/arrow/pull/5698]

> [Java] Fix promotable write to handle nullvectors
> -------------------------------------------------
>
> Key: ARROW-6949
> URL: https://issues.apache.org/jira/browse/ARROW-6949
> Project: Apache Arrow
> Issue Type: Task
> Components: Java
> Reporter: Prudhvi Porandla
> Assignee: Prudhvi Porandla
> Priority: Minor
> Labels: pull-request-available
> Fix For: 1.0.0
> Time Spent: 2h 40m
> Remaining Estimate: 0h
[jira] [Updated] (ARROW-6983) [C++] Threaded task group crashes sometimes
[ https://issues.apache.org/jira/browse/ARROW-6983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ARROW-6983:
----------------------------------
Labels: pull-request-available (was: )

> [C++] Threaded task group crashes sometimes
> -------------------------------------------
>
> Key: ARROW-6983
> URL: https://issues.apache.org/jira/browse/ARROW-6983
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++
> Reporter: Neal Richardson
> Assignee: Antoine Pitrou
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.15.1
>
> You can give this a more descriptive title :)
> See discussion on ARROW-6977.
> https://gist.github.com/pitrou/87f3091c226db3306c45b2c32dd9aea8 seems to fix
> it.
[jira] [Updated] (ARROW-6944) [Rust] Add StringType
[ https://issues.apache.org/jira/browse/ARROW-6944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ARROW-6944:
----------------------------------
Labels: pull-request-available (was: )

> [Rust] Add StringType
> ---------------------
>
> Key: ARROW-6944
> URL: https://issues.apache.org/jira/browse/ARROW-6944
> Project: Apache Arrow
> Issue Type: Sub-task
> Components: Rust
> Reporter: Neville Dipale
> Assignee: Neville Dipale
> Priority: Major
> Labels: pull-request-available
>
> Create a separate String type which uses UTF8, and restrict the BinaryArray
> to opaque binary data
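The split proposed in ARROW-6944 can be sketched in Python as two types that share one storage layout but differ in what they promise about the bytes. This is an illustration only, not the Rust crate's API: the class and method names below are hypothetical, and the offsets-plus-values layout is a simplified stand-in for Arrow's variable-length binary layout (no validity bitmap).

```python
from typing import List, Sequence


class BinaryArray:
    """Hypothetical variable-length array of opaque bytes (offsets + values buffer)."""

    def __init__(self, items: Sequence[bytes]) -> None:
        self.offsets: List[int] = [0]
        buf = bytearray()
        for item in items:
            buf.extend(item)
            self.offsets.append(len(buf))
        self.values = bytes(buf)

    def __len__(self) -> int:
        return len(self.offsets) - 1

    def value(self, i: int) -> bytes:
        # Slots are opaque: no encoding is assumed or checked.
        return self.values[self.offsets[i]:self.offsets[i + 1]]


class StringArray:
    """Hypothetical string array: same layout, but every slot is valid UTF-8."""

    def __init__(self, items: Sequence[str]) -> None:
        self._data = BinaryArray([s.encode("utf-8") for s in items])

    @classmethod
    def from_binary(cls, data: BinaryArray) -> "StringArray":
        # Validate every slot up front; raises UnicodeDecodeError on invalid UTF-8.
        for i in range(len(data)):
            data.value(i).decode("utf-8")
        out = cls.__new__(cls)
        out._data = data
        return out

    def __len__(self) -> int:
        return len(self._data)

    def value(self, i: int) -> str:
        return self._data.value(i).decode("utf-8")
```

Keeping validation at the boundary (`from_binary`) means readers of a `StringArray` can rely on valid UTF-8 without rechecking, while `BinaryArray` stays free to carry arbitrary bytes.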
[jira] [Updated] (ARROW-6948) [Rust] [Parquet] Fix bool array support in arrow reader.
[ https://issues.apache.org/jira/browse/ARROW-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neville Dipale updated ARROW-6948:
----------------------------------
Fix Version/s: 1.0.0

> [Rust] [Parquet] Fix bool array support in arrow reader.
> --------------------------------------------------------
>
> Key: ARROW-6948
> URL: https://issues.apache.org/jira/browse/ARROW-6948
> Project: Apache Arrow
> Issue Type: Bug
> Components: Rust
> Reporter: Renjie Liu
> Assignee: Renjie Liu
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.0.0
> Time Spent: 1h 10m
> Remaining Estimate: 0h
[jira] [Updated] (ARROW-6948) [Rust] [Parquet] Fix bool array support in arrow reader.
[ https://issues.apache.org/jira/browse/ARROW-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neville Dipale updated ARROW-6948:
----------------------------------
Component/s: Rust

> [Rust] [Parquet] Fix bool array support in arrow reader.
> --------------------------------------------------------
>
> Key: ARROW-6948
> URL: https://issues.apache.org/jira/browse/ARROW-6948
> Project: Apache Arrow
> Issue Type: Bug
> Components: Rust
> Reporter: Renjie Liu
> Assignee: Renjie Liu
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h 10m
> Remaining Estimate: 0h