[jira] [Updated] (ARROW-15729) [R] Reading large files randomly freezes

2022-02-24 Thread Joris Van den Bossche (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-15729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joris Van den Bossche updated ARROW-15729:
--
Fix Version/s: (was: 6.0.1)

> [R] Reading large files randomly freezes
> 
>
> Key: ARROW-15729
> URL: https://issues.apache.org/jira/browse/ARROW-15729
> Project: Apache Arrow
> Issue Type: Bug
> Components: R
> Reporter: Christian
> Priority: Critical
>
> Hi -
> I recently upgraded to Arrow 6.0.1 and am using it in R.
> When reading a large file (~10 GB) on Windows, R intermittently freezes. I 
> can see memory being allocated in the first 10-20 seconds, but then nothing 
> happens and R stops responding (the R process becomes idle too).
> I'm using the option options(arrow.use_threads=FALSE).
> I didn't have this issue with the previous version I was using (0.15.1), and 
> the file reads fine under Linux.
> I would post a reproducible example, but the freeze happens randomly. I even 
> tried reading large files in pieces by first getting all the distinct 
> sections of a specific column (with compute>collect), but that hangs too.
> Any ideas would be appreciated.
> *Edit*
> Not sure if this makes sense to anyone, but after a few tries it seems that 
> the issue only happens in RStudio. In the plain R console the file loads 
> fine. All I'm executing is the below.
> options(arrow.use_threads=FALSE)
> aa <- arrow::read_arrow('.../file.arrow5')
> One thing I want to point out is that the underlying Rscript process under 
> RStudio definitely seems to use more than one core when executing the above.
> *Edit2*
> Using arrow::set_cpu_count(1) seems to solve the issue.
>  
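
For readers who want to try the workaround from Edit2, here is a minimal
sketch in R. It is not a confirmed fix, only one way to apply the setting the
reporter describes. It assumes the arrow package is installed, keeps the
elided path from the report as a placeholder (so the read is skipped unless
that path exists), and restores the previous thread count afterwards. The
variable names old_threads and path are made up for illustration.

```r
# Workaround sketch: cap Arrow's CPU thread pool at one thread before reading.
# Assumption: the 'arrow' package is installed; otherwise nothing is run.
if (requireNamespace("arrow", quietly = TRUE)) {
  old_threads <- arrow::cpu_count()    # remember the current pool size
  arrow::set_cpu_count(1)              # the setting reported to help in Edit2
  options(arrow.use_threads = FALSE)   # option already used in the report

  path <- ".../file.arrow5"            # elided path from the report, kept as-is
  if (file.exists(path)) {
    # read_arrow() is the call used in the report.
    aa <- arrow::read_arrow(path)
  }

  arrow::set_cpu_count(old_threads)    # restore the previous pool size
}
```

Calling arrow::cpu_count() before and after the read is a quick way to check
that the cap was actually in effect for the session.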



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (ARROW-15729) [R] Reading large files randomly freezes

2022-02-18 Thread Christian (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-15729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian updated ARROW-15729:
--
Description: 
Hi -

I recently upgraded to Arrow 6.0.1 and am using it in R.

When reading a large file (~10 GB) on Windows, R intermittently freezes. I can 
see memory being allocated in the first 10-20 seconds, but then nothing 
happens and R stops responding (the R process becomes idle too).

I'm using the option options(arrow.use_threads=FALSE).

I didn't have this issue with the previous version I was using (0.15.1), and 
the file reads fine under Linux.

I would post a reproducible example, but the freeze happens randomly. I even 
tried reading large files in pieces by first getting all the distinct 
sections of a specific column (with compute>collect), but that hangs too.

Any ideas would be appreciated.

*Edit*

Not sure if this makes sense to anyone, but after a few tries it seems that 
the issue only happens in RStudio. In the plain R console the file loads fine. 
All I'm executing is the below.

options(arrow.use_threads=FALSE)
aa <- arrow::read_arrow('.../file.arrow5')

One thing I want to point out is that the underlying Rscript process under 
RStudio definitely seems to use more than one core when executing the above.

*Edit2*

Using arrow::set_cpu_count(1) seems to solve the issue.

 




[jira] [Updated] (ARROW-15729) [R] Reading large files randomly freezes

2022-02-18 Thread Nicola Crane (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-15729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicola Crane updated ARROW-15729:
-
Summary: [R] Reading large files randomly freezes  (was: Reading large 
files randomly freezes)
