[jira] [Updated] (SPARK-20723) Random Forest Classifier should expose intermediateRDDStorageLevel similar to ALS
[ https://issues.apache.org/jira/browse/SPARK-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

madhukara phatak updated SPARK-20723:
-------------------------------------
    Description:
Currently the Random Forest implementation caches its intermediate data using the *MEMORY_AND_DISK* storage level. This causes problems in low-memory scenarios, so we should expose an expert param *intermediateStorageLevel* that allows users to customise the storage level. This is similar to the ALS options specified in the JIRA below:
https://issues.apache.org/jira/browse/SPARK-14412

  was:
Currently the Random Forest implementation caches its intermediate data using the *MEMORY_AND_DISK* storage level. This causes problems in low-memory scenarios, so we should expose an expert param *intermediateRDDStorageLevel* that allows users to customise the storage level. This is similar to the ALS options specified in the JIRA below:
https://issues.apache.org/jira/browse/SPARK-14412


> Random Forest Classifier should expose intermediateRDDStorageLevel similar to ALS
> ----------------------------------------------------------------------------------
>
>                 Key: SPARK-20723
>                 URL: https://issues.apache.org/jira/browse/SPARK-20723
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML
>    Affects Versions: 2.3.0
>            Reporter: madhukara phatak
>            Priority: Minor
>
> Currently the Random Forest implementation caches its intermediate data using the *MEMORY_AND_DISK* storage level. This causes problems in low-memory scenarios, so we should expose an expert param *intermediateStorageLevel* which allows users to customise the storage level. This is similar to the ALS options specified in the JIRA below:
> https://issues.apache.org/jira/browse/SPARK-14412
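For context, a minimal Scala sketch of what the requested param could look like, assuming it mirrors the ALS API added in SPARK-14412. The ALS setters below exist today; the RandomForestClassifier setter is hypothetical and only illustrates the proposal.

{code:scala}
import org.apache.spark.ml.classification.RandomForestClassifier
import org.apache.spark.ml.recommendation.ALS

// Existing behaviour this request mirrors: ALS lets expert users tune the
// storage level of its intermediate RDDs (SPARK-14412).
val als = new ALS()
  .setIntermediateStorageLevel("MEMORY_ONLY")    // default is MEMORY_AND_DISK
  .setFinalStorageLevel("MEMORY_AND_DISK")

// Proposed, hypothetical equivalent for Random Forest (does not exist yet):
val rf = new RandomForestClassifier()
  .setLabelCol("label")
  .setFeaturesCol("features")
//  .setIntermediateStorageLevel("DISK_ONLY")    // the expert param this JIRA asks for
{code}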
[jira] [Created] (SPARK-20723) Random Forest Classifier should expose intermediateRDDStorageLevel similar to ALS
madhukara phatak created SPARK-20723:
-------------------------------------

             Summary: Random Forest Classifier should expose intermediateRDDStorageLevel similar to ALS
                 Key: SPARK-20723
                 URL: https://issues.apache.org/jira/browse/SPARK-20723
             Project: Spark
          Issue Type: New Feature
          Components: ML
    Affects Versions: 2.3.0
            Reporter: madhukara phatak
            Priority: Minor


Currently the Random Forest implementation caches its intermediate data using the *MEMORY_AND_DISK* storage level. This causes problems in low-memory scenarios, so we should expose an expert param *intermediateRDDStorageLevel* that allows users to customise the storage level. This is similar to the ALS options specified in the JIRA below:
https://issues.apache.org/jira/browse/SPARK-14412
[jira] [Updated] (SPARK-7084) Improve the saveAsTable documentation
[ https://issues.apache.org/jira/browse/SPARK-7084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

madhukara phatak updated SPARK-7084:
------------------------------------
    Description:
The documentation of saveAsTable is a little confusing. From the name of the API it sounds like it creates a Hive table that can be accessed from Hive, but that is not the case, as discussed here: [https://www.mailarchive.com/u...@spark.apache.org/msg26902.html]. This issue is to improve the documentation to reflect that.

  was:
The documentation of saveAsTable is a little confusing. From the name of the API it sounds like it creates a Hive table that can be accessed from Hive, but that is not the case, as discussed here[https://www.mailarchive.com/u...@spark.apache.org/msg26902.html]. This issue is to improve the documentation to reflect that.


Improve the saveAsTable documentation
-------------------------------------

                 Key: SPARK-7084
                 URL: https://issues.apache.org/jira/browse/SPARK-7084
             Project: Spark
          Issue Type: Documentation
          Components: SQL
    Affects Versions: 1.3.1
            Reporter: madhukara phatak
            Priority: Minor

The documentation of saveAsTable is a little confusing. From the name of the API it sounds like it creates a Hive table that can be accessed from Hive, but that is not the case, as discussed here: [https://www.mailarchive.com/u...@spark.apache.org/msg26902.html]. This issue is to improve the documentation to reflect that.
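To make the source of confusion concrete, here is a small Scala sketch against the Spark 1.3-era API this issue was filed for. The file, table, and variable names are made up for illustration, and the sketch assumes an existing SparkContext named sc.

{code:scala}
import org.apache.spark.sql.hive.HiveContext

val hiveContext = new HiveContext(sc)
val df = hiveContext.jsonFile("people.json")

// Despite the name, this creates a *managed* table whose data is written in a
// Spark SQL-specific (Parquet-based) layout by default, so it is not guaranteed
// to be readable as a plain Hive table from the Hive CLI.
df.saveAsTable("people_spark")

// Reading it back through Spark SQL works as expected.
hiveContext.table("people_spark").show()
{code}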
[jira] [Created] (SPARK-7084) Improve the saveAsTable documentation
madhukara phatak created SPARK-7084:
------------------------------------

             Summary: Improve the saveAsTable documentation
                 Key: SPARK-7084
                 URL: https://issues.apache.org/jira/browse/SPARK-7084
             Project: Spark
          Issue Type: Documentation
          Components: SQL
    Affects Versions: 1.3.1
            Reporter: madhukara phatak
            Priority: Minor


The documentation of saveAsTable is a little confusing. From the name of the API it sounds like it creates a Hive table that can be accessed from Hive, but that is not the case, as discussed here: [https://www.mailarchive.com/u...@spark.apache.org/msg26902.html]. This issue is to improve the documentation to reflect that.
[jira] [Commented] (SPARK-4414) SparkContext.wholeTextFiles Doesn't work with S3 Buckets
[ https://issues.apache.org/jira/browse/SPARK-4414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14362856#comment-14362856 ]

madhukara phatak commented on SPARK-4414:
-----------------------------------------

Hi,
I just ran your example on my local machine; here is the gist: https://gist.github.com/phatak-dev/e75d5d0d773b857903c1. It works fine for me. Can you test the same?


SparkContext.wholeTextFiles Doesn't work with S3 Buckets
---------------------------------------------------------

                 Key: SPARK-4414
                 URL: https://issues.apache.org/jira/browse/SPARK-4414
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 1.1.0, 1.2.0
            Reporter: Pedro Rodriguez
            Priority: Critical

SparkContext.wholeTextFiles does not read files that SparkContext.textFile can read. Below are the general steps to reproduce; my specific case follows them in a git repo.

Steps to reproduce:
1. Create an Amazon S3 bucket, make it public, and add multiple files.
2. Attempt to read the bucket with sc.wholeTextFiles("s3n://mybucket/myfile.txt").
3. Spark returns the following error, even though the file exists:
Exception in thread "main" java.io.FileNotFoundException: File does not exist: /myfile.txt
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:517)
at org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat$OneFileInfo.<init>(CombineFileInputFormat.java:489)
4. Change the call to sc.textFile("s3n://mybucket/myfile.txt") and there is no error message; the application runs fine.

There is a question on StackOverflow about this as well: http://stackoverflow.com/questions/26258458/sparkcontext-wholetextfiles-java-io-filenotfoundexception-file-does-not-exist

This link points to the repo/lines of code. The uncommented call doesn't work, while the commented call works as expected: https://github.com/EntilZha/nips-lda-spark/blob/45f5ad1e2646609ef9d295a0954fbefe84111d8a/src/main/scala/NipsLda.scala#L13-L19

It would be easy to work around this by using textFile with a multi-file argument, but wholeTextFiles should work correctly for S3 bucket files as well.
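For reference, a minimal Scala reproduction along the lines of the gist mentioned above. The bucket and file names are placeholders, and it assumes s3n credentials are already configured (fs.s3n.awsAccessKeyId / fs.s3n.awsSecretAccessKey).

{code:scala}
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf().setAppName("wholeTextFilesS3").setMaster("local[2]")
val sc = new SparkContext(conf)

// Works: textFile returns an RDD[String] of lines.
val lines = sc.textFile("s3n://mybucket/myfile.txt")
println(lines.count())

// Reported to fail with FileNotFoundException in the affected versions.
// When it works, wholeTextFiles returns an RDD[(fileName, fileContent)].
val files = sc.wholeTextFiles("s3n://mybucket/myfile.txt")
files.collect().foreach { case (name, content) =>
  println(s"$name -> ${content.length} chars")
}
{code}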