[jira] [Commented] (SPARK-6832) Handle partial reads in SparkR JVM to worker communication

2016-08-21 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15429895#comment-15429895
 ] 

Apache Spark commented on SPARK-6832:
-

User 'krishnakalyan3' has created a pull request for this issue:
https://github.com/apache/spark/pull/14741

> Handle partial reads in SparkR JVM to worker communication
> --
>
> Key: SPARK-6832
> URL: https://issues.apache.org/jira/browse/SPARK-6832
> Project: Spark
> Issue Type: Improvement
> Components: SparkR
> Reporter: Shivaram Venkataraman
> Priority: Minor
>
> After we move to using a socket between the R worker and the JVM, it is 
> possible that readBin() in R will return partial results (for example, when 
> interrupted by a signal).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-6832) Handle partial reads in SparkR JVM to worker communication

2016-08-18 Thread Shivaram Venkataraman (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15426798#comment-15426798
 ] 

Shivaram Venkataraman commented on SPARK-6832:
--

I think we can add a new method `readBinFully` and then replace calls to 
`readBin` with that method.
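
The idea behind such a method can be sketched in a language-neutral way. The block below is a minimal illustration in Python rather than the SparkR change itself: `recv_fully` is a hypothetical name, and the sketch assumes a stream socket where a single read may return fewer bytes than requested. It keeps reading until the requested count has arrived and fails loudly if the peer closes early.

```python
import socket


def recv_fully(sock, n):
    """Read exactly n bytes from sock, looping over short reads.

    Hypothetical helper for illustration only; it mirrors the idea of a
    `readBinFully` wrapper, not any existing SparkR function.
    """
    chunks = []
    remaining = n
    while remaining > 0:
        # recv may legitimately return fewer bytes than asked for.
        chunk = sock.recv(remaining)
        if not chunk:
            # An empty result means the peer closed the connection.
            raise EOFError(
                "connection closed with %d of %d bytes unread" % (remaining, n)
            )
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

Callers would then use the wrapper wherever they previously issued a single read and assumed it returned the full payload.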

Regarding simulating this -- I think you could try to manually send a signal 
(using something like `kill -s SIGCHLD`) to an R process while it is reading a 
large amount of data using `readBin`.
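
A short read can also be provoked without signals, which may be easier to reproduce locally. The sketch below is an assumption-laden illustration in Python, not the signal-based approach described above: `demo_short_read` is a hypothetical name, and it relies on a local socket pair where the writer delivers the payload in two pieces so that the first read sees only part of it.

```python
import socket


def demo_short_read(payload=b"abcdefgh", split=3):
    """Show that one read can return less than the full payload.

    Hypothetical helper: the writer sends the payload in two pieces, and
    the reader asks for the whole payload each time.
    """
    writer, reader = socket.socketpair()
    try:
        # Only the first `split` bytes are available for the first read.
        writer.sendall(payload[:split])
        first = reader.recv(len(payload))
        # The rest arrives later, so a second read is needed to get it.
        writer.sendall(payload[split:])
        second = reader.recv(len(payload))
        return first, second
    finally:
        writer.close()
        reader.close()
```

Code that treats the first read as the complete message would silently lose the trailing bytes, which is the failure a retry loop is meant to prevent.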




[jira] [Commented] (SPARK-6832) Handle partial reads in SparkR JVM to worker communication

2016-08-18 Thread Krishna Kalyan (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15426012#comment-15426012
 ] 

Krishna Kalyan commented on SPARK-6832:
---

[~shivaram],[~davies],
I see there are 7 occurrences of readBin. From what I understand, I need to wrap 
them in a retry method. Is this understanding correct? 
Another question: how do I test partial reads or simulate them on my local 
system?

Thanks,
Krishna




[jira] [Commented] (SPARK-6832) Handle partial reads in SparkR JVM to worker communication

2016-08-09 Thread Shivaram Venkataraman (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15414262#comment-15414262
 ] 

Shivaram Venkataraman commented on SPARK-6832:
--

[~KrishnaKalyan3] Thanks for looking at this issue. The problem we ran into 
when we opened the bug is discussed in 
https://github.com/amplab-extras/SparkR-pkg/pull/193#issuecomment-78144164 and 
the comments following it.

I think the goal here was to add a retry mechanism around readBin that would be 
resilient against signals.




[jira] [Commented] (SPARK-6832) Handle partial reads in SparkR JVM to worker communication

2016-08-09 Thread Krishna Kalyan (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413972#comment-15413972
 ] 

Krishna Kalyan commented on SPARK-6832:
---

[~shivaram] 
I see that changes need to be made in `R/pkg/R/deserialize.R`, specifically in 
the `readString` function. However, I don't understand how to go about making 
changes to handle partial reads. I have gone through 
`https://stat.ethz.ch/R-manual/R-devel/library/base/html/readBin.html`. 

Thanks,
Krishna
