[
https://issues.apache.org/jira/browse/ARROW-17984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17628471#comment-17628471
]
Ziheng Wang commented on ARROW-17984:
-
```
Thread 42 (Thread 0x7fd1d77fe700 (LWP 10
```
[
https://issues.apache.org/jira/browse/ARROW-17984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625816#comment-17625816
]
Ziheng Wang commented on ARROW-17984:
-
I have attached a crash file. You can unpack
[
https://issues.apache.org/jira/browse/ARROW-17984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ziheng Wang updated ARROW-17984:
Attachment: _usr_bin_python3.8.1000.crash
> pq.read_table doesn't seem to be thread safe
> ---
Ziheng Wang created ARROW-18105:
---
Summary: Arrow Flight SegFault
Key: ARROW-18105
URL: https://issues.apache.org/jira/browse/ARROW-18105
Project: Apache Arrow
Issue Type: Bug
Componen
[
https://issues.apache.org/jira/browse/ARROW-17984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17616016#comment-17616016
]
Ziheng Wang commented on ARROW-17984:
-
Unfortunately I cannot figure out how to get
[
https://issues.apache.org/jira/browse/ARROW-17984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ziheng Wang updated ARROW-17984:
Description:
Before PR: [https://github.com/apache/arrow/pull/13799] gets merged in master,
I am
Ziheng Wang created ARROW-17984:
---
Summary: pq.read_table doesn't seem to be thread safe
Key: ARROW-17984
URL: https://issues.apache.org/jira/browse/ARROW-17984
Project: Apache Arrow
Issue Type:
Ziheng Wang created ARROW-17529:
---
Summary: Clean up how the CSV reader handles the first buffer
Key: ARROW-17529
URL: https://issues.apache.org/jira/browse/ARROW-17529
Project: Apache Arrow
Iss
[
https://issues.apache.org/jira/browse/ARROW-17481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ziheng Wang updated ARROW-17481:
Description:
The current dataset reader for CSV is pretty slow on EC2 reading from S3.
EC2 instan
[
https://issues.apache.org/jira/browse/ARROW-17481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ziheng Wang updated ARROW-17481:
Summary: [C++] [Python] Major performance improvements to CSV reading from
S3 (was: Major perform
Ziheng Wang created ARROW-17481:
---
Summary: Major performance improvements to CSV reading from S3
Key: ARROW-17481
URL: https://issues.apache.org/jira/browse/ARROW-17481
Project: Apache Arrow
Is
[
https://issues.apache.org/jira/browse/ARROW-17380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ziheng Wang updated ARROW-17380:
Summary: [C++] [Python] Tag record batches with start_byte and end_byte
information (was: Tag rec
Ziheng Wang created ARROW-17380:
---
Summary: Tag record batches with start_byte and end_byte
information
Key: ARROW-17380
URL: https://issues.apache.org/jira/browse/ARROW-17380
Project: Apache Arrow
[
https://issues.apache.org/jira/browse/ARROW-17313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17576926#comment-17576926
]
Ziheng Wang commented on ARROW-17313:
-
There is no physical way you can do this with
[
https://issues.apache.org/jira/browse/ARROW-17313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17576079#comment-17576079
]
Ziheng Wang commented on ARROW-17313:
-
Ideally we update the Dataset Scanner to be a
[
https://issues.apache.org/jira/browse/ARROW-17313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ziheng Wang updated ARROW-17313:
Description:
Sometimes it's desirable to just read a portion of a CSV. The best way to do
that is
[
https://issues.apache.org/jira/browse/ARROW-17313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17575967#comment-17575967
]
Ziheng Wang commented on ARROW-17313:
-
Also this will not support compressed formats
[
https://issues.apache.org/jira/browse/ARROW-17313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17575916#comment-17575916
]
Ziheng Wang commented on ARROW-17313:
-
Ah I meant what we should do about the linbre
[
https://issues.apache.org/jira/browse/ARROW-17313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17575862#comment-17575862
]
Ziheng Wang commented on ARROW-17313:
-
[~apitrou] can you elaborate a bit on your wa
[
https://issues.apache.org/jira/browse/ARROW-17313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17575545#comment-17575545
]
Ziheng Wang commented on ARROW-17313:
-
I think if you support things like this, then
[
https://issues.apache.org/jira/browse/ARROW-17313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17575489#comment-17575489
]
Ziheng Wang commented on ARROW-17313:
-
My proposal is that we will allow additional
[
https://issues.apache.org/jira/browse/ARROW-17313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17575489#comment-17575489
]
Ziheng Wang edited comment on ARROW-17313 at 8/5/22 12:02 AM:
Ziheng Wang created ARROW-17313:
---
Summary: Add Byte Range to CSV Reader ReadOptions
Key: ARROW-17313
URL: https://issues.apache.org/jira/browse/ARROW-17313
Project: Apache Arrow
Issue Type: Imp
[
https://issues.apache.org/jira/browse/ARROW-17299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ziheng Wang updated ARROW-17299:
Summary: [C++] [Python] Expose the Scanner kDefaultBatchReadahead and
kDefaultFragmentReadahead pa
Ziheng Wang created ARROW-17299:
---
Summary: Expose the Scanner kDefaultBatchReadahead and
kDefaultFragmentReadahead parameters
Key: ARROW-17299
URL: https://issues.apache.org/jira/browse/ARROW-17299
Proj
[
https://issues.apache.org/jira/browse/ARROW-14635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ziheng Wang reassigned ARROW-14635:
---
Assignee: Ziheng Wang
> [C++][Dataset] Devise a mechanism to limit the total "system ram" (
[
https://issues.apache.org/jira/browse/ARROW-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ziheng Wang closed ARROW-17114.
---
Resolution: Duplicate
> [Python][C++] O_DIRECT write support
>
[
https://issues.apache.org/jira/browse/ARROW-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ziheng Wang reassigned ARROW-17114:
---
Assignee: Ziheng Wang
> [Python][C++] O_DIRECT write support
> ---
Ziheng Wang created ARROW-17114:
---
Summary: [Python][C++] O_DIRECT write support
Key: ARROW-17114
URL: https://issues.apache.org/jira/browse/ARROW-17114
Project: Apache Arrow
Issue Type: New Fe
[
https://issues.apache.org/jira/browse/ARROW-16521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ziheng Wang reassigned ARROW-16521:
---
Assignee: Ziheng Wang
> [C++][R] Configure curl timeout policy for S3
> ---
[
https://issues.apache.org/jira/browse/ARROW-16037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17513609#comment-17513609
]
Ziheng Wang commented on ARROW-16037:
-
I am on ubuntu
pa.default_memory_pool().backe
[
https://issues.apache.org/jira/browse/ARROW-16037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17513573#comment-17513573
]
Ziheng Wang commented on ARROW-16037:
-
Does not help.
mem usage 179580928 8000
[
https://issues.apache.org/jira/browse/ARROW-16037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ziheng Wang updated ARROW-16037:
Priority: Blocker (was: Major)
> Possible memory leak in compute.take
> -
Ziheng Wang created ARROW-16037:
---
Summary: Possible memory leak in compute.take
Key: ARROW-16037
URL: https://issues.apache.org/jira/browse/ARROW-16037
Project: Apache Arrow
Issue Type: Bug