[jira] [Commented] (ARROW-6485) [Format][C++]Support the format of a COO sparse matrix that has separated row and column indices
[ https://issues.apache.org/jira/browse/ARROW-6485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17459900#comment-17459900 ] Kenta Murata commented on ARROW-6485: - [~apitrou] I’m still interested in this feature, but I doubt that people want Apache Arrow to have it, because nobody has ever mentioned it. > [Format][C++]Support the format of a COO sparse matrix that has separated row > and column indices > > > Key: ARROW-6485 > URL: https://issues.apache.org/jira/browse/ARROW-6485 > Project: Apache Arrow > Issue Type: Improvement > Components: C++, Format >Reporter: Kenta Murata >Assignee: Kenta Murata >Priority: Major > Labels: pull-request-available > Fix For: 8.0.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > To support zero-copy interchange of scipy.sparse.coo_matrix, I'd like to > add a new format of a COO matrix that has separated row and column indices. -- This message was sent by Atlassian Jira (v8.20.1#820001)
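The difference between the two index layouts mentioned in ARROW-6485 can be sketched in plain Ruby. This is only an illustration with made-up data: it assumes, as far as I understand, that the existing COO index keeps one combined matrix of coordinates (one row per non-zero element), while scipy.sparse.coo_matrix keeps separate row and col arrays. The variable names are hypothetical and no Arrow API is used.

```ruby
# Combined layout: one [row, column] pair per non-zero element,
# i.e. a single nnz x 2 index matrix (illustrative data).
coords = [[0, 0], [0, 2], [1, 1]]
values = [4.0, 5.0, 7.0]

# Separated layout: one array of row indices and one of column indices.
rows, cols = coords.transpose

# Going back from the separated arrays to the combined matrix.
rebuilt = rows.zip(cols)

puts "rows=#{rows.inspect} cols=#{cols.inspect} values=#{values.inspect}"
```

Both conversions above build new arrays, which is exactly the copy that a format with separated row and column indices would make unnecessary.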
[jira] [Commented] (ARROW-14518) [Ruby] ArrayBuilder doesn't work correctly with Decimal
[ https://issues.apache.org/jira/browse/ARROW-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17436613#comment-17436613 ] Kenta Murata commented on ARROW-14518: -- {quote}Could you add BigDecimal#scale?{quote} Yes, I'll add it. But the new property will be available only with the latest {{bigdecimal}} gem. To support older versions of {{bigdecimal}} gems, we need to detect the existence of {{BigDecimal#scale}} and emulate it when it is absent. > [Ruby] ArrayBuilder doesn't work correctly with Decimal > --- > > Key: ARROW-14518 > URL: https://issues.apache.org/jira/browse/ARROW-14518 > Project: Apache Arrow > Issue Type: Bug > Components: Ruby >Reporter: Kanstantsin Ilchanka >Priority: Minor > > When trying to convert raw data with decimal values to Arrow::Table error > received > > {code:java} > Arrow::Table.new(x: [BigDecimal('1.1')]) > ArgumentError: wrong arguments: Arrow::Decimal128ArrayBuilder#initialize(): > available signatures: (data_type: > interface(Arrow::Decimal128DataType(GArrowDecimal128DataType))) > {code} > I guess this is because Decimal128ArrayBuilder expects Decimal128DataType in > initialiser, however I'm not sure how to correctly and effectively detect > precision and scale from array of BigDecimal > > {code:java} > Arrow::VERSION > => "5.0.0"{code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
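The detect-and-emulate idea from this comment could look roughly like the following. It is a minimal sketch, not the code that was eventually written: `emulated_scale` and `decimal_scale` are hypothetical helper names, and the emulation simply counts fractional digits in the plain decimal string returned by `BigDecimal#to_s("F")`.

```ruby
require "bigdecimal"

# Hypothetical fallback: number of digits after the decimal point,
# ignoring trailing zeros. Non-finite values (NaN, Infinity) yield 0.
def emulated_scale(value)
  return 0 unless value.finite?
  _integer_part, fraction = value.to_s("F").split(".", 2)
  fraction.to_s.sub(/0+\z/, "").length
end

# Use BigDecimal#scale when the installed bigdecimal gem provides it,
# and fall back to the emulation otherwise.
def decimal_scale(value)
  if value.respond_to?(:scale)
    value.scale
  else
    emulated_scale(value)
  end
end

puts decimal_scale(BigDecimal("1.1"))
```

Checking with `respond_to?` at call time keeps the helper independent of the gem version; an alternative would be to define `BigDecimal#scale` once at load time when it is missing.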
[jira] [Commented] (ARROW-14518) [Ruby] ArrayBuilder doesn't work correctly with Decimal
[ https://issues.apache.org/jira/browse/ARROW-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17436611#comment-17436611 ] Kenta Murata commented on ARROW-14518: -- Arrow's decimal number is completely different from {{BigDecimal}}. The former is a fixed-point number whereas the latter is a floating-point number. The best approach is to introduce a new fixed-point number system for Arrow's decimal numbers on the Ruby side. The second-best approach, in my view, is to let {{Arrow::Decimal128ArrayBuilder}} support arbitrary precisions and scales. That can be done by pooling {{BigDecimal}} values. It's OK to add a new property to {{BigDecimal}} that returns the number of digits following the decimal point, to assist the latter case. A suitable name for this property may be {{BigDecimal#scale}}. > [Ruby] ArrayBuilder doesn't work correctly with Decimal > --- > > Key: ARROW-14518 > URL: https://issues.apache.org/jira/browse/ARROW-14518 > Project: Apache Arrow > Issue Type: Bug > Components: Ruby >Reporter: Kanstantsin Ilchanka >Priority: Minor > > When trying to convert raw data with decimal values to Arrow::Table error > received > > {code:java} > Arrow::Table.new(x: [BigDecimal('1.1')]) > ArgumentError: wrong arguments: Arrow::Decimal128ArrayBuilder#initialize(): > available signatures: (data_type: > interface(Arrow::Decimal128DataType(GArrowDecimal128DataType))) > {code} > I guess this is because Decimal128ArrayBuilder expects Decimal128DataType in > initialiser, however I'm not sure how to correctly and effectively detect > precision and scale from array of BigDecimal > > {code:java} > Arrow::VERSION > => "5.0.0"{code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
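One way to read "pooling {{BigDecimal}} values" is to collect all values first and only then pick a precision and scale wide enough for every one of them. The sketch below does that in plain Ruby without touching any Arrow class. The helper names are hypothetical, the inputs are assumed to be finite, and the limit of 38 digits is the maximum precision of a 128-bit decimal.

```ruby
require "bigdecimal"

DECIMAL128_MAX_PRECISION = 38

# Hypothetical helper: digits after the decimal point, without trailing zeros.
def fractional_digits(value)
  _integer_part, fraction = value.to_s("F").split(".", 2)
  fraction.to_s.sub(/0+\z/, "").length
end

# Hypothetical helper: digits before the decimal point (0 when |value| < 1).
def integer_digits(value)
  integer = value.abs.to_i
  integer.zero? ? 0 : integer.to_s.length
end

# Pool the values and return the smallest [precision, scale] that fits all.
def infer_precision_and_scale(values)
  scale = values.map { |v| fractional_digits(v) }.max || 0
  whole = values.map { |v| integer_digits(v) }.max || 0
  precision = [whole + scale, 1].max
  if precision > DECIMAL128_MAX_PRECISION
    raise ArgumentError, "precision #{precision} exceeds #{DECIMAL128_MAX_PRECISION}"
  end
  [precision, scale]
end

precision, scale = infer_precision_and_scale(
  [BigDecimal("1.1"), BigDecimal("22.25"), BigDecimal("0.125")]
)
puts "precision=#{precision} scale=#{scale}"
```

The resulting pair is what a data type such as {{Arrow::Decimal128DataType}} would need; how the builder would consume it is left open here.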
[jira] [Commented] (ARROW-14518) [Ruby] ArrayBuilder doesn't work correctly with Decimal
[ https://issues.apache.org/jira/browse/ARROW-14518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17436352#comment-17436352 ] Kenta Murata commented on ARROW-14518: -- A {{BigDecimal}} number manages only its precision in its decimal notation. We can get the precision of a {{BigDecimal}} number with the {{BigDecimal#precision}} method. {{irb(main):001:0> BigDecimal("1.1").precision}} {{=> 2}} But the {{BigDecimal#precision}} method doesn't count trailing zeros. {{irb(main):002:0> BigDecimal("1.10").precision}} {{=> 2}} The reason why a {{BigDecimal}} number doesn't have a scale property may be that a {{BigDecimal}} number isn't a fixed-precision number. > [Ruby] ArrayBuilder doesn't work correctly with Decimal > --- > > Key: ARROW-14518 > URL: https://issues.apache.org/jira/browse/ARROW-14518 > Project: Apache Arrow > Issue Type: Bug > Components: Ruby >Reporter: Kanstantsin Ilchanka >Priority: Minor > > When trying to convert raw data with decimal values to Arrow::Table error > received > > {code:java} > Arrow::Table.new(x: [BigDecimal('1.1')]) > ArgumentError: wrong arguments: Arrow::Decimal128ArrayBuilder#initialize(): > available signatures: (data_type: > interface(Arrow::Decimal128DataType(GArrowDecimal128DataType))) > {code} > I guess this is because Decimal128ArrayBuilder expects Decimal128DataType in > initialiser, however I'm not sure how to correctly and effectively detect > precision and scale from array of BigDecimal > > {code:java} > Arrow::VERSION > => "5.0.0"{code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-11974) [GLib] Add CsvFragmentScanOption support
Kenta Murata created ARROW-11974: Summary: [GLib] Add CsvFragmentScanOption support Key: ARROW-11974 URL: https://issues.apache.org/jira/browse/ARROW-11974 Project: Apache Arrow Issue Type: New Feature Components: GLib Reporter: Kenta Murata -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-11850) [GLib] GARROW_VERSION_0_16 macro is missing
[ https://issues.apache.org/jira/browse/ARROW-11850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kenta Murata resolved ARROW-11850. -- Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 9623 [https://github.com/apache/arrow/pull/9623] > [GLib] GARROW_VERSION_0_16 macro is missing > --- > > Key: ARROW-11850 > URL: https://issues.apache.org/jira/browse/ARROW-11850 > Project: Apache Arrow > Issue Type: Bug > Components: GLib >Reporter: Kenta Murata >Assignee: Kenta Murata >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 20m > Remaining Estimate: 0h > > The {{GARROW_VERSION_0_16}} macro is missing from arrow-glib/version.h. > Its absence causes warnings like the following: > {code} > compiling ../../../../ext/arrow-nmatrix/arrow-nmatrix.c > In file included from /opt/arrow-dbg/include/arrow-glib/arrow-glib.h:23, > from ../../../../ext/arrow-nmatrix/arrow-nmatrix.c:17: > /opt/arrow-dbg/include/arrow-glib/version.h:297:36: warning: > "GARROW_VERSION_0_16" is not defined, evaluates to 0 [-Wundef] > 297 | #if GARROW_VERSION_MIN_REQUIRED >= GARROW_VERSION_0_16 > |^~~ > /opt/arrow-dbg/include/arrow-glib/version.h:305:34: warning: > "GARROW_VERSION_0_16" is not defined, evaluates to 0 [-Wundef] > 305 | #if GARROW_VERSION_MAX_ALLOWED < GARROW_VERSION_0_16 > | ^~~ > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-11850) [GLib] GARROW_VERSION_0_16 macro is missing
Kenta Murata created ARROW-11850: Summary: [GLib] GARROW_VERSION_0_16 macro is missing Key: ARROW-11850 URL: https://issues.apache.org/jira/browse/ARROW-11850 Project: Apache Arrow Issue Type: Bug Components: GLib Reporter: Kenta Murata Assignee: Kenta Murata The {{GARROW_VERSION_0_16}} macro is missing from arrow-glib/version.h. Its absence causes warnings like the following: {code} compiling ../../../../ext/arrow-nmatrix/arrow-nmatrix.c In file included from /opt/arrow-dbg/include/arrow-glib/arrow-glib.h:23, from ../../../../ext/arrow-nmatrix/arrow-nmatrix.c:17: /opt/arrow-dbg/include/arrow-glib/version.h:297:36: warning: "GARROW_VERSION_0_16" is not defined, evaluates to 0 [-Wundef] 297 | #if GARROW_VERSION_MIN_REQUIRED >= GARROW_VERSION_0_16 |^~~ /opt/arrow-dbg/include/arrow-glib/version.h:305:34: warning: "GARROW_VERSION_0_16" is not defined, evaluates to 0 [-Wundef] 305 | #if GARROW_VERSION_MAX_ALLOWED < GARROW_VERSION_0_16 | ^~~ {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-11782) [GLib][Dataset] Remove bindings for internal classes
[ https://issues.apache.org/jira/browse/ARROW-11782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kenta Murata reassigned ARROW-11782: Assignee: Kenta Murata > [GLib][Dataset] Remove bindings for internal classes > > > Key: ARROW-11782 > URL: https://issues.apache.org/jira/browse/ARROW-11782 > Project: Apache Arrow > Issue Type: Improvement > Components: GLib >Affects Versions: 3.0.0 >Reporter: Ben Kietzman >Assignee: Kenta Murata >Priority: Major > Fix For: 4.0.0 > > > GLib and ruby include bindings for internal classes such as ScanOptions, > ScanContext, InMemoryScanTask, ScanTask, ... These are probably unnecessary > and should be removed to present a simpler interface less prone to breakage > under refactoring of the wrapped classes > https://github.com/apache/arrow/pull/9532/checks?check_run_id=1974229719#step:8:2071 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-11686) [C++]flight-test-integration-client sometimes exits by SIGABRT but does not print the stack trace
Kenta Murata created ARROW-11686: Summary: [C++]flight-test-integration-client sometimes exits by SIGABRT but does not print the stack trace Key: ARROW-11686 URL: https://issues.apache.org/jira/browse/ARROW-11686 Project: Apache Arrow Issue Type: Test Components: C++ Reporter: Kenta Murata Assignee: Kenta Murata I found that flight-test-integration-client sometimes exits with SIGABRT. This problem first appeared at the commit https://github.com/apache/arrow/commit/848c803bf162dca2e31cb63fbb3d2f9dbdda460e on the master branch, but the change in that commit seems unrelated to the problem. To investigate it, I would like this command to show the stack trace when it exits with SIGABRT. This should be possible by calling the {{ArrowLog::InstallFailureSignalHandler}} function in the main function. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-11685) [C++] Typo in future_test.cc
Kenta Murata created ARROW-11685: Summary: [C++] Typo in future_test.cc Key: ARROW-11685 URL: https://issues.apache.org/jira/browse/ARROW-11685 Project: Apache Arrow Issue Type: Task Components: C++ Reporter: Kenta Murata Assignee: Kenta Murata FutureStessTest in future_test.cc should be FutureStressTest. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-11470) [C++] Overflow occurs on integer multiplications in ComputeRowMajorStrides, ComputeColumnMajorStrides, and CheckTensorStridesValidity
[ https://issues.apache.org/jira/browse/ARROW-11470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kenta Murata updated ARROW-11470: - Description: OSS-Fuzz reports that the integer multiplication in the ComputeRowMajorStrides function overflows. https://oss-fuzz.com/testcase-detail/623225726408 The same issue exists in ComputeColumnMajorStrides. Moreover, a similar overflow occurs in the CalculateValueOffset function called from the CheckTensorStridesValidity function. https://oss-fuzz.com/testcase-detail/6583463383793664 was: OSS-Fuzz reports that the integer multiplication in the ComputeRowMajorStrides function overflows. https://oss-fuzz.com/testcase-detail/623225726408 The same issue exists in ComputeColumnMajorStrides. > [C++] Overflow occurs on integer multiplications in ComputeRowMajorStrides, > ComputeColumnMajorStrides, and CheckTensorStridesValidity > - > > Key: ARROW-11470 > URL: https://issues.apache.org/jira/browse/ARROW-11470 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Reporter: Kenta Murata >Assignee: Kenta Murata >Priority: Major > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > OSS-Fuzz reports that the integer multiplication in the ComputeRowMajorStrides > function overflows. > https://oss-fuzz.com/testcase-detail/623225726408 > The same issue exists in ComputeColumnMajorStrides. > Moreover, a similar overflow occurs in the CalculateValueOffset > function called from the CheckTensorStridesValidity function. > https://oss-fuzz.com/testcase-detail/6583463383793664 -- This message was sent by Atlassian Jira (v8.3.4#803005)
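To make the overflow concrete, here is a small Ruby sketch of a row-major stride computation that rejects any product that would not fit in a signed 64-bit integer. It is not Arrow's C++ code: Ruby integers never overflow, so the check is an explicit comparison against a hypothetical `INT64_MAX` constant, and `row_major_strides` is an illustrative name. Dimensions are assumed to be positive.

```ruby
INT64_MAX = 2**63 - 1

# Row-major strides in bytes: the last dimension is contiguous, and each
# earlier stride is the next stride times the next dimension's extent.
# Raises RangeError as soon as a running product exceeds the int64 range.
def row_major_strides(shape, byte_width)
  strides = []
  remaining = byte_width
  shape.reverse_each do |dim|
    strides.unshift(remaining)
    remaining *= dim
    raise RangeError, "stride computation overflows int64" if remaining > INT64_MAX
  end
  strides
end

puts row_major_strides([2, 3, 4], 8).inspect
```

In C++ the same guard has to happen before or during the multiplication, because the wrapped result of an overflowing signed multiplication cannot be trusted afterwards.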
[jira] [Updated] (ARROW-11470) [C++] Overflow occurs on integer multiplications in ComputeRowMajorStrides, ComputeColumnMajorStrides, and CheckTensorStridesValidity
[ https://issues.apache.org/jira/browse/ARROW-11470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kenta Murata updated ARROW-11470: - Summary: [C++] Overflow occurs on integer multiplications in ComputeRowMajorStrides, ComputeColumnMajorStrides, and CheckTensorStridesValidity (was: [C++] Overflow occurs on integer multiplications in Compute(Row|Column)MajorStrides) > [C++] Overflow occurs on integer multiplications in ComputeRowMajorStrides, > ComputeColumnMajorStrides, and CheckTensorStridesValidity > - > > Key: ARROW-11470 > URL: https://issues.apache.org/jira/browse/ARROW-11470 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Reporter: Kenta Murata >Assignee: Kenta Murata >Priority: Major > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > OSS-Fuzz reports that the integer multiplication in the ComputeRowMajorStrides > function overflows. > https://oss-fuzz.com/testcase-detail/623225726408 > The same issue exists in ComputeColumnMajorStrides. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-11470) [C++] Overflow occurs on integer multiplications in Compute(Row|Column)MajorStrides
Kenta Murata created ARROW-11470: Summary: [C++] Overflow occurs on integer multiplications in Compute(Row|Column)MajorStrides Key: ARROW-11470 URL: https://issues.apache.org/jira/browse/ARROW-11470 Project: Apache Arrow Issue Type: Bug Components: C++ Reporter: Kenta Murata Assignee: Kenta Murata OSS-Fuzz reports that the integer multiplication in the ComputeRowMajorStrides function overflows. https://oss-fuzz.com/testcase-detail/623225726408 The same issue exists in ComputeColumnMajorStrides. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-9642) [C++] Let MakeBuilder refer DictionaryType's index_type for deciding the starting bit width of the indices
[ https://issues.apache.org/jira/browse/ARROW-9642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kenta Murata updated ARROW-9642: Summary: [C++] Let MakeBuilder refer DictionaryType's index_type for deciding the starting bit width of the indices (was: [C++] Let MakeBuilder refer DictionaryType's index_type to detect the starting bit width of the indices) > [C++] Let MakeBuilder refer DictionaryType's index_type for deciding the > starting bit width of the indices > -- > > Key: ARROW-9642 > URL: https://issues.apache.org/jira/browse/ARROW-9642 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Reporter: Kenta Murata >Assignee: Kenta Murata >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-9642) [C++] Let MakeBuilder refer DictionaryType's index_type to detect the starting bit width of the indices
Kenta Murata created ARROW-9642: --- Summary: [C++] Let MakeBuilder refer DictionaryType's index_type to detect the starting bit width of the indices Key: ARROW-9642 URL: https://issues.apache.org/jira/browse/ARROW-9642 Project: Apache Arrow Issue Type: Bug Components: C++ Reporter: Kenta Murata Assignee: Kenta Murata -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-9499) [C++] AdaptiveIntBuilder::AppendNull does not increment the null count
[ https://issues.apache.org/jira/browse/ARROW-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kenta Murata updated ARROW-9499: Summary: [C++] AdaptiveIntBuilder::AppendNull does not increment the null count (was: [C++] AdaptiveIntBuilder::null_count does not return the null count) > [C++] AdaptiveIntBuilder::AppendNull does not increment the null count > -- > > Key: ARROW-9499 > URL: https://issues.apache.org/jira/browse/ARROW-9499 > Project: Apache Arrow > Issue Type: Bug >Reporter: Kenta Murata >Assignee: Kenta Murata >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-9499) [C++] AdaptiveIntBuilder::null_count does not return the null count
Kenta Murata created ARROW-9499: --- Summary: [C++] AdaptiveIntBuilder::null_count does not return the null count Key: ARROW-9499 URL: https://issues.apache.org/jira/browse/ARROW-9499 Project: Apache Arrow Issue Type: Bug Reporter: Kenta Murata Assignee: Kenta Murata -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-9454) [GLib] Add binding of some dictionary builders
Kenta Murata created ARROW-9454: --- Summary: [GLib] Add binding of some dictionary builders Key: ARROW-9454 URL: https://issues.apache.org/jira/browse/ARROW-9454 Project: Apache Arrow Issue Type: Improvement Components: GLib Reporter: Kenta Murata Assignee: Kenta Murata -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-9331) [C++] Improve the performance of Tensor-to-SparseTensor conversion
[ https://issues.apache.org/jira/browse/ARROW-9331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kenta Murata updated ARROW-9331: Description: In ARROW-9156, I rewrote the Tensor-to-SparseTensor converters to reduce the library size. The drawback of that change is that it slowed down the conversion. We need an additional change to improve the conversion speed. > [C++] Improve the performance of Tensor-to-SparseTensor conversion > -- > > Key: ARROW-9331 > URL: https://issues.apache.org/jira/browse/ARROW-9331 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Kenta Murata >Assignee: Kenta Murata >Priority: Major > > In ARROW-9156, I rewrote the Tensor-to-SparseTensor converters to reduce the > library size. The drawback of that change is that it slowed down the > conversion. We need an additional change to improve the conversion speed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-9331) [C++] Improve the performance of Tensor-to-SparseTensor conversion
Kenta Murata created ARROW-9331: --- Summary: [C++] Improve the performance of Tensor-to-SparseTensor conversion Key: ARROW-9331 URL: https://issues.apache.org/jira/browse/ARROW-9331 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Kenta Murata Assignee: Kenta Murata -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-9156) [C++] Reducing the code size of the tensor module
Kenta Murata created ARROW-9156: --- Summary: [C++] Reducing the code size of the tensor module Key: ARROW-9156 URL: https://issues.apache.org/jira/browse/ARROW-9156 Project: Apache Arrow Issue Type: New Feature Components: C++ Reporter: Kenta Murata Assignee: Kenta Murata -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-8970) [C++] Reduce shared library / binary code size (umbrella issue)
[ https://issues.apache.org/jira/browse/ARROW-8970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17138062#comment-17138062 ] Kenta Murata commented on ARROW-8970: - [~wesm] There are some options for reducing the code size of the tensor modules. (1) Reducing inline functions. (2) Reducing the sparse tensor's index value types. I guess we needn't support 8-bit and 16-bit types for sparse tensor indices. At least scipy's sparse matrix doesn't use 8-bit and 16-bit types for sparse matrix indices. We should investigate other libraries that provide sparse tensors. (3) Dropping the functions for converting between dense and sparse tensors. These functions aren't necessary for exchanging existing tensor data between systems. I will start (1) and (2) right now. If we don't need to provide conversion functions, I will also start (3). > [C++] Reduce shared library / binary code size (umbrella issue) > --- > > Key: ARROW-8970 > URL: https://issues.apache.org/jira/browse/ARROW-8970 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Wes McKinney >Priority: Major > > We're reaching a point where we may need to be careful about decisions that > increase code size: > * Instantiating too many templates for code that isn't performance sensitive, > or where some templates may do the same thing (e.g. Int32Type kernels may do > the same thing as a Date32Type kernel) > * Inlining functions that don't need to be inline > Code size tends to correlate also with compilation times, but not always. 
> I'll use this umbrella issue to organize issues related to reducing compiled > code size > At this moment (2020-05-27), here are the 25 largest object files in a -O2 > build > {code} > 524896src/arrow/CMakeFiles/arrow_objlib.dir/array/builder_dict.cc.o > 531920src/arrow/CMakeFiles/arrow_objlib.dir/filesystem/s3fs.cc.o > 552000src/arrow/CMakeFiles/arrow_objlib.dir/json/converter.cc.o > 575920src/arrow/CMakeFiles/arrow_objlib.dir/csv/converter.cc.o > 595112 > src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/scalar_cast_string.cc.o > 645728src/arrow/CMakeFiles/arrow_objlib.dir/type.cc.o > 683040 > src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/scalar_set_lookup.cc.o > 702232src/arrow/CMakeFiles/arrow_objlib.dir/ipc/reader.cc.o > 729912src/arrow/CMakeFiles/arrow_objlib.dir/tensor/coo_converter.cc.o > 752776src/arrow/CMakeFiles/arrow_objlib.dir/tensor/csc_converter.cc.o > 752776src/arrow/CMakeFiles/arrow_objlib.dir/tensor/csr_converter.cc.o > 877680src/arrow/CMakeFiles/arrow_objlib.dir/array/dict_internal.cc.o > 885624src/arrow/CMakeFiles/arrow_objlib.dir/builder.cc.o > 919072src/arrow/CMakeFiles/arrow_objlib.dir/scalar.cc.o > 941776src/arrow/CMakeFiles/arrow_objlib.dir/ipc/json_internal.cc.o > 1055248 src/arrow/CMakeFiles/arrow_objlib.dir/ipc/json_simple.cc.o > 1233304 > src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/scalar_compare.cc.o > 1265160 src/arrow/CMakeFiles/arrow_objlib.dir/sparse_tensor.cc.o > 1343480 src/arrow/CMakeFiles/arrow_objlib.dir/tensor/csf_converter.cc.o > 1346928 src/arrow/CMakeFiles/arrow_objlib.dir/array.cc.o > 1502568 > src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/vector_hash.cc.o > 1609760 > src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/scalar_cast_numeric.cc.o > 1794416 src/arrow/CMakeFiles/arrow_objlib.dir/array/diff.cc.o > 2759552 > src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/vector_filter.cc.o > 7609432 > src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/vector_take.cc.o > {code} -- This 
message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-4221) [Format] Add canonical flag in COO sparse index
[ https://issues.apache.org/jira/browse/ARROW-4221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kenta Murata reassigned ARROW-4221: --- Assignee: Kenta Murata > [Format] Add canonical flag in COO sparse index > --- > > Key: ARROW-4221 > URL: https://issues.apache.org/jira/browse/ARROW-4221 > Project: Apache Arrow > Issue Type: Improvement > Components: Format >Reporter: Kenta Murata >Assignee: Kenta Murata >Priority: Minor > Labels: sparse > Fix For: 1.0.0 > > > To support the integration with scipy.sparse.coo_matrix, it is necessary to > add a flag in SparseCOOIndex. This flag denotes whether the elements in a COO > sparse tensor are sorted lexicographically or not. -- This message was sent by Atlassian Jira (v8.3.4#803005)
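What the proposed flag records can be illustrated with a short Ruby check. This is a sketch with made-up coordinates, not the SparseCOOIndex implementation: `lexicographically_sorted?` is a hypothetical helper, and it relies on Ruby's `Array#<=>`, which compares arrays element by element.

```ruby
# True when every coordinate tuple is less than or equal to its successor
# in lexicographic order (empty and single-element inputs count as sorted).
def lexicographically_sorted?(coords)
  coords.each_cons(2).all? { |a, b| (a <=> b) <= 0 }
end

sorted_coords   = [[0, 0], [0, 2], [1, 1]]
unsorted_coords = [[1, 0], [0, 2]]

puts lexicographically_sorted?(sorted_coords)
puts lexicographically_sorted?(unsorted_coords)
```

Storing the answer as a flag lets a consumer skip this linear scan when it needs sorted indices.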