[jira] [Commented] (SPARK-10206) Specify schema during IsotonicRegression.save to avoid reflection

2015-08-25 Thread Vinod KC (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712462#comment-14712462
 ] 

Vinod KC commented on SPARK-10206:
--

I'm working on this

 Specify schema during IsotonicRegression.save to avoid reflection
 -

 Key: SPARK-10206
 URL: https://issues.apache.org/jira/browse/SPARK-10206
 Project: Spark
  Issue Type: Improvement
  Components: MLlib
Reporter: Feynman Liang
Priority: Minor

 [IsotonicRegression.save|https://github.com/apache/spark/blob/f5b028ed2f1ad6de43c8b50ebf480e1b6c047035/mllib/src/main/scala/org/apache/spark/mllib/regression/IsotonicRegression.scala#L184]
  currently infers a schema from a case class when the schema is known and 
 should be manually provided.
 See parent JIRA for rationale.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-10204) Specify schema during NaiveBayes.save to avoid reflection

2015-08-25 Thread Vinod KC (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-10204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712459#comment-14712459
 ] 

Vinod KC commented on SPARK-10204:
--

I'm working on this

 Specify schema during NaiveBayes.save to avoid reflection
 -

 Key: SPARK-10204
 URL: https://issues.apache.org/jira/browse/SPARK-10204
 Project: Spark
  Issue Type: Improvement
  Components: MLlib
Reporter: Feynman Liang
Priority: Minor

 [NaiveBayes.save|https://github.com/apache/spark/blob/f5b028ed2f1ad6de43c8b50ebf480e1b6c047035/mllib/src/main/scala/org/apache/spark/mllib/classification/NaiveBayes.scala#L181]
  currently infers a schema from a case class when the schema is known and 
 should be manually provided.
 See parent JIRA for rationale.






[jira] [Commented] (SPARK-10205) Specify schema during PowerIterationClustering.save to avoid reflection

2015-08-25 Thread Vinod KC (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-10205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712461#comment-14712461
 ] 

Vinod KC commented on SPARK-10205:
--

I'm working on this

 Specify schema during PowerIterationClustering.save to avoid reflection
 ---

 Key: SPARK-10205
 URL: https://issues.apache.org/jira/browse/SPARK-10205
 Project: Spark
  Issue Type: Improvement
  Components: MLlib
Reporter: Feynman Liang
Priority: Minor

 [PowerIterationClustering.save|https://github.com/apache/spark/blob/f5b028ed2f1ad6de43c8b50ebf480e1b6c047035/mllib/src/main/scala/org/apache/spark/mllib/clustering/PowerIterationClustering.scala#L82]
  currently infers a schema from a case class when the schema is known and 
 should be manually provided.
 See parent JIRA for rationale.






[jira] [Commented] (SPARK-10211) Specify schema during MatrixFactorizationModel.save to avoid reflection

2015-08-25 Thread Vinod KC (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-10211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712466#comment-14712466
 ] 

Vinod KC commented on SPARK-10211:
--

I'm working on this

 Specify schema during MatrixFactorizationModel.save to avoid reflection
 ---

 Key: SPARK-10211
 URL: https://issues.apache.org/jira/browse/SPARK-10211
 Project: Spark
  Issue Type: Improvement
  Components: MLlib
Reporter: Feynman Liang
Priority: Minor

 [MatrixFactorizationModel.save|https://github.com/apache/spark/blob/f5b028ed2f1ad6de43c8b50ebf480e1b6c047035/mllib/src/main/scala/org/apache/spark/mllib/recommendation/MatrixFactorizationModel.scala#L361]
  currently infers a schema from an RDD of tuples when the schema is known and 
 should be manually provided.
 See parent JIRA for rationale.






[jira] [Commented] (SPARK-10208) Specify schema during LocalLDAModel.save to avoid reflection

2015-08-25 Thread Vinod KC (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-10208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712464#comment-14712464
 ] 

Vinod KC commented on SPARK-10208:
--

I'm working on this

 Specify schema during LocalLDAModel.save to avoid reflection
 

 Key: SPARK-10208
 URL: https://issues.apache.org/jira/browse/SPARK-10208
 Project: Spark
  Issue Type: Improvement
  Components: MLlib
Reporter: Feynman Liang
Priority: Minor

 [LocalLDAModel.save|https://github.com/apache/spark/blob/f5b028ed2f1ad6de43c8b50ebf480e1b6c047035/mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAModel.scala#L389]
  currently infers a schema from a case class when the schema is known and 
 should be manually provided.
 See parent JIRA for rationale.






[jira] [Commented] (SPARK-10209) Specify schema during DistributedLDAModel.save to avoid reflection

2015-08-25 Thread Vinod KC (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-10209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712465#comment-14712465
 ] 

Vinod KC commented on SPARK-10209:
--

I'm working on this

 Specify schema during DistributedLDAModel.save to avoid reflection
 --

 Key: SPARK-10209
 URL: https://issues.apache.org/jira/browse/SPARK-10209
 Project: Spark
  Issue Type: Improvement
  Components: MLlib
Reporter: Feynman Liang
Priority: Minor

 [DistributedLDAModel.save|https://github.com/apache/spark/blob/f5b028ed2f1ad6de43c8b50ebf480e1b6c047035/mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAModel.scala#L783]
  currently infers a schema from a case class when the schema is known and 
 should be manually provided.
 See parent JIRA for rationale.






[jira] [Commented] (SPARK-10207) Specify schema during Word2Vec.save to avoid reflection

2015-08-25 Thread Vinod KC (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-10207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712463#comment-14712463
 ] 

Vinod KC commented on SPARK-10207:
--

I'm working on this

 Specify schema during Word2Vec.save to avoid reflection
 ---

 Key: SPARK-10207
 URL: https://issues.apache.org/jira/browse/SPARK-10207
 Project: Spark
  Issue Type: Improvement
  Components: MLlib
Reporter: Feynman Liang
Priority: Minor

 [Word2Vec.save|https://github.com/apache/spark/blob/7cfc0750e14f2c1b3847e4720cc02150253525a9/mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala#L615]
  currently infers a schema from a case class when the schema is known and 
 should be manually provided.
 See parent JIRA for rationale.






[jira] [Commented] (SPARK-10212) Specify schema during TreeEnsembleModel.save to avoid reflection

2015-08-25 Thread Vinod KC (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-10212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712467#comment-14712467
 ] 

Vinod KC commented on SPARK-10212:
--

I'm working on this

 Specify schema during TreeEnsembleModel.save to avoid reflection
 

 Key: SPARK-10212
 URL: https://issues.apache.org/jira/browse/SPARK-10212
 Project: Spark
  Issue Type: Improvement
  Components: MLlib
Reporter: Feynman Liang
Priority: Minor

 [TreeEnsembleModel.save|https://github.com/apache/spark/blob/f5b028ed2f1ad6de43c8b50ebf480e1b6c047035/mllib/src/main/scala/org/apache/spark/mllib/tree/model/treeEnsembleModels.scala#L451]
  currently infers a schema from an RDD of {{NodeData}} case classes when the 
 schema is known and should be manually provided.
 See parent JIRA for rationale.






[jira] [Commented] (SPARK-10213) Specify schema during DecisionTreeModel.save to avoid reflection

2015-08-25 Thread Vinod KC (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-10213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712469#comment-14712469
 ] 

Vinod KC commented on SPARK-10213:
--

I'm working on this

 Specify schema during DecisionTreeModel.save to avoid reflection
 

 Key: SPARK-10213
 URL: https://issues.apache.org/jira/browse/SPARK-10213
 Project: Spark
  Issue Type: Improvement
  Components: MLlib
Reporter: Feynman Liang
Priority: Minor

 [DecisionTreeModel.save|https://github.com/apache/spark/blob/f5b028ed2f1ad6de43c8b50ebf480e1b6c047035/mllib/src/main/scala/org/apache/spark/mllib/tree/model/DecisionTreeModel.scala#L238]
  currently infers a schema from a {{NodeData}} case class when the schema is 
 known and should be manually provided.
 See parent JIRA for rationale.






[jira] [Commented] (SPARK-10199) Avoid using reflections for parquet model save

2015-08-25 Thread Vinod KC (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-10199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712496#comment-14712496
 ] 

Vinod KC commented on SPARK-10199:
--

Sure, I'll group all the changes into a single PR

 Avoid using reflections for parquet model save
 --

 Key: SPARK-10199
 URL: https://issues.apache.org/jira/browse/SPARK-10199
 Project: Spark
  Issue Type: Improvement
  Components: ML, MLlib
Reporter: Feynman Liang
Priority: Minor

 These items are not high priority since the overhead of writing to Parquet is 
 much greater than that of runtime reflection.
 Multiple model save/load implementations in MLlib use case classes to infer a 
 schema for the data frame saved to Parquet. However, inferring a schema from 
 case classes or tuples uses [runtime 
 reflection|https://github.com/apache/spark/blob/d7b4c095271c36fcc7f9ded267ecf5ec66fac803/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala#L361], 
 which is unnecessary since the types are already known at the time `save` is 
 called.
 It would be better to specify the schema for the data frame directly using 
 {{sqlContext.createDataFrame(dataRDD, schema)}}.
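The reflection-versus-explicit-schema contrast can be sketched without Spark on the classpath. `Field` and `inferFields` below are hypothetical stand-ins for Spark's `StructField` and the reflective schema-inference path; they only illustrate the principle.

```scala
// Hypothetical stand-in for Spark's StructField.
final case class Field(name: String, dataType: String)

// The kind of case class that save methods currently reflect over at runtime.
case class Data(boundary: Double, prediction: Double)

// Reflection path: discover the schema by inspecting the class at runtime.
def inferFields(clazz: Class[_]): Seq[Field] =
  clazz.getDeclaredFields.toSeq
    .map(f => Field(f.getName, f.getType.getSimpleName))
    .sortBy(_.name)

// Explicit path: the writer already knows the schema, so just state it,
// as sqlContext.createDataFrame(dataRDD, schema) allows.
val explicitSchema = Seq(Field("boundary", "double"), Field("prediction", "double"))

assert(inferFields(classOf[Data]) == explicitSchema)
```

Both paths describe the same shape; the explicit one simply skips the runtime-reflection step.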






[jira] [Commented] (SPARK-10201) Specify schema during GaussianMixtureModel.save to avoid reflection

2015-08-25 Thread Vinod KC (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712455#comment-14712455
 ] 

Vinod KC commented on SPARK-10201:
--

I'm working on this

 Specify schema during GaussianMixtureModel.save to avoid reflection
 ---

 Key: SPARK-10201
 URL: https://issues.apache.org/jira/browse/SPARK-10201
 Project: Spark
  Issue Type: Improvement
  Components: MLlib
Reporter: Feynman Liang
Priority: Minor

 [GaussianMixtureModel.save|https://github.com/apache/spark/blob/f5b028ed2f1ad6de43c8b50ebf480e1b6c047035/mllib/src/main/scala/org/apache/spark/mllib/clustering/GaussianMixtureModel.scala#L140]
  currently infers a schema from a case class when the schema is known and 
 should be manually provided.
 See parent JIRA for rationale.






[jira] [Commented] (SPARK-10200) Specify schema during GLMRegressionModel.save to avoid reflection

2015-08-25 Thread Vinod KC (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-10200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712454#comment-14712454
 ] 

Vinod KC commented on SPARK-10200:
--

I'm working on this

 Specify schema during GLMRegressionModel.save to avoid reflection
 -

 Key: SPARK-10200
 URL: https://issues.apache.org/jira/browse/SPARK-10200
 Project: Spark
  Issue Type: Improvement
  Components: MLlib
Reporter: Feynman Liang
Priority: Minor

 [GLMRegressionModel.save|https://github.com/apache/spark/blob/d7b4c095271c36fcc7f9ded267ecf5ec66fac803/mllib/src/main/scala/org/apache/spark/mllib/regression/impl/GLMRegressionModel.scala#L44]
  currently infers a schema from a case class when the schema is known and 
 should be manually provided.
 See parent JIRA for rationale.






[jira] [Commented] (SPARK-8919) Add @since tags to mllib.recommendation

2015-07-28 Thread Vinod KC (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14644572#comment-14644572
 ] 

Vinod KC commented on SPARK-8919:
-

Please assign this issue to me

 Add @since tags to mllib.recommendation
 ---

 Key: SPARK-8919
 URL: https://issues.apache.org/jira/browse/SPARK-8919
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, MLlib
Reporter: Xiangrui Meng
Priority: Minor
  Labels: starter
 Fix For: 1.5.0

   Original Estimate: 1h
  Remaining Estimate: 1h
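The requested tags are plain Scaladoc annotations on public APIs. A minimal illustration, with an object and method invented for the example (not actual mllib.recommendation API):

```scala
object Recommender {
  /**
   * Trains a toy model from a rank parameter.
   *
   * @since 1.4.0
   */
  def train(rank: Int): Int = rank * 2
}

assert(Recommender.train(4) == 8)
```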








[jira] [Created] (SPARK-8787) Change the parameter order of @deprecated in package object sql

2015-07-02 Thread Vinod KC (JIRA)
Vinod KC created SPARK-8787:
---

 Summary: Change the parameter order of @deprecated in package 
object sql
 Key: SPARK-8787
 URL: https://issues.apache.org/jira/browse/SPARK-8787
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Reporter: Vinod KC
Priority: Trivial


The parameter order of the @deprecated annotation in package object sql is wrong: 
deprecated("1.3.0", "use DataFrame").

This has to be changed to deprecated("use DataFrame", "1.3.0").
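In Scala, `@deprecated` takes the message first and the since-version second, which is why the order matters. A minimal illustration (the object and methods are invented for the example):

```scala
object Api {
  // Correct order: @deprecated(message, since). Swapping the arguments still
  // compiles (both are Strings) but yields a misleading deprecation notice.
  @deprecated("use newCount", "1.3.0")
  def oldCount(xs: Seq[Int]): Int = newCount(xs)

  def newCount(xs: Seq[Int]): Int = xs.length
}

assert(Api.oldCount(Seq(1, 2, 3)) == 3)
```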






[jira] [Commented] (SPARK-8628) Race condition in AbstractSparkSQLParser.parse

2015-06-30 Thread Vinod KC (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-8628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14609477#comment-14609477
 ] 

Vinod KC commented on SPARK-8628:
-

Can you please assign this to me?

 Race condition in AbstractSparkSQLParser.parse
 --

 Key: SPARK-8628
 URL: https://issues.apache.org/jira/browse/SPARK-8628
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.3.0, 1.3.1, 1.4.0
Reporter: Santiago M. Mola
Priority: Critical
  Labels: regression
 Fix For: 1.5.0, 1.4.2


 SPARK-5009 introduced the following code in AbstractSparkSQLParser:
 {code}
 def parse(input: String): LogicalPlan = {
   // Initialize the Keywords.
   lexical.initialize(reservedWords)
   phrase(start)(new lexical.Scanner(input)) match {
     case Success(plan, _) => plan
     case failureOrError => sys.error(failureOrError.toString)
   }
 }
 {code}
 The corresponding initialize method in SqlLexical is not thread-safe:
 {code}
   /* This is a work around to support the lazy setting */
   def initialize(keywords: Seq[String]): Unit = {
     reserved.clear()
     reserved ++= keywords
   }
 {code}
 I'm hitting this when parsing multiple SQL queries concurrently. When parsing 
 of one query starts, it empties the reserved keyword list; a race condition 
 then occurs and other queries fail to parse because they recognize keywords 
 as identifiers.
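One way to close this race, sketched here without the parser-combinator machinery (a hypothetical simplification, not necessarily the fix that was merged), is to bind the reserved-word set immutably at construction instead of clearing and refilling shared mutable state on every parse:

```scala
// Each lexer owns an immutable keyword set, so a concurrent parse can never
// observe a half-initialized (momentarily empty) reserved-word list.
final class SafeLexical(keywords: Seq[String]) {
  private val reserved: Set[String] = keywords.map(_.toUpperCase).toSet
  def isReserved(word: String): Boolean = reserved.contains(word.toUpperCase)
}

val lexical = new SafeLexical(Seq("select", "from", "where"))
assert(lexical.isReserved("SELECT"))
assert(!lexical.isReserved("foo"))
```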






[jira] [Created] (SPARK-7489) Spark shell crashes when compiled with Scala 2.11 and SPARK_PREPEND_CLASSES=true

2015-05-08 Thread Vinod KC (JIRA)
Vinod KC created SPARK-7489:
---

 Summary: Spark shell crashes when compiled with Scala 2.11 and 
SPARK_PREPEND_CLASSES=true
 Key: SPARK-7489
 URL: https://issues.apache.org/jira/browse/SPARK-7489
 Project: Spark
  Issue Type: Bug
  Components: Spark Shell
Reporter: Vinod KC


Steps followed
export SPARK_PREPEND_CLASSES=true
dev/change-version-to-2.11.sh
 sbt/sbt -Pyarn -Phadoop-2.4 -Dscala-2.11 -DskipTests clean assembly

bin/spark-shell

15/05/08 22:31:35 INFO Main: Created spark context..
Spark context available as sc.
java.lang.NoClassDefFoundError: org/apache/hadoop/hive/conf/HiveConf
  at java.lang.Class.getDeclaredConstructors0(Native Method)
  at java.lang.Class.privateGetDeclaredConstructors(Class.java:2671)
  at java.lang.Class.getConstructor0(Class.java:3075)
  at java.lang.Class.getConstructor(Class.java:1825)
  at org.apache.spark.repl.Main$.createSQLContext(Main.scala:86)
  ... 45 elided
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
  at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
  ... 50 more
<console>:11: error: not found: value sqlContext
       import sqlContext.implicits._
              ^
<console>:11: error: not found: value sqlContext
       import sqlContext.sql

There is a similar resolved JIRA issue, SPARK-7470, and a PR 
https://github.com/apache/spark/pull/5997 , which handled the same issue only in 
Scala 2.10.






[jira] [Updated] (SPARK-7438) Validation Error while running countApproxDistinct with relative accuracy >= 0.38

2015-05-07 Thread Vinod KC (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod KC updated SPARK-7438:

Description: 
Eg Code: 
val a = sc.parallelize(1 to 1, 20)
val b = a ++ a ++ a ++ a ++ a
b.countApproxDistinct(0.38)
java.lang.IllegalArgumentException: requirement failed: p (3) must be at least 4

Issue 1: When relative accuracy >= 0.38, an IllegalArgumentException is thrown, 
as the precision p evaluates to 3.
However, the same input in countApproxDistinctByKey(0.38) works fine. Usage of 
relativeSD should be consistent in both countApproxDistinct and 
countApproxDistinctByKey.
Issue 2: The validation error message "p (3) must be at least 4" gives no clue 
about what went wrong.
Issue 3: When relative accuracy > 0.17, a proper validation error message 
is not shown from countApproxDistinct.


  was:
Eg Code: 
val a = sc.parallelize(1 to 1, 20)
val b = a++a++a++a++a
b.countApproxDistinct(0.38)
java.lang.IllegalArgumentException: requirement failed: p (3) must be at least 4

Issue 1: When relative accuracy >= 0.38, IAE is thrown, as the precision p 
evaluates to 3.
However, the same input in countApproxDistinctByKey(0.38) works fine. Usage of 
relativeSD should be consistent in both countApproxDistinct and 
countApproxDistinctByKey.
Issue 2: Validation error message "p (3) must be at least 4" is not giving a 
clue on what went wrong.
Issue 3: When relative accuracy > 0.17, a proper validation error message 
is not shown from countApproxDistinct.



 Validation Error while running countApproxDistinct with relative accuracy 
 >= 0.38  
 --

 Key: SPARK-7438
 URL: https://issues.apache.org/jira/browse/SPARK-7438
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Reporter: Vinod KC
Priority: Minor

 Eg Code: 
 val a = sc.parallelize(1 to 1, 20)
 val b = a ++ a ++ a ++ a ++ a
 b.countApproxDistinct(0.38)
 java.lang.IllegalArgumentException: requirement failed: p (3) must be at 
 least 4
 Issue 1: When relative accuracy >= 0.38, IAE is thrown, as the precision p 
 evaluates to 3.
 However, the same input in countApproxDistinctByKey(0.38) works fine. Usage of 
 relativeSD should be consistent in both countApproxDistinct and 
 countApproxDistinctByKey.
 Issue 2: Validation error message "p (3) must be at least 4" is not giving a 
 clue on what went wrong.
 Issue 3: When relative accuracy > 0.17, a proper validation error message 
 is not shown from countApproxDistinct.
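The cryptic threshold follows from how the HyperLogLog++ precision p is derived from relativeSD. The formula below is a reconstruction of the one in RDD.countApproxDistinct; the constant 1.054 is an assumption taken from that implementation and may differ across versions:

```scala
// p shrinks as relativeSD grows; around relativeSD = 0.38 it drops to 3,
// which violates HyperLogLogPlus's requirement that p be at least 4.
def precision(relativeSD: Double): Int =
  math.ceil(2.0 * math.log(1.054 / relativeSD) / math.log(2)).toInt

assert(precision(0.38) == 3) // produces "p (3) must be at least 4"
assert(precision(0.37) == 4) // accepted
assert(precision(0.01) == 14)
```

A clearer require message would report the offending relativeSD value rather than the derived p.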






[jira] [Created] (SPARK-7438) Validation Error while running countApproxDistinct with relative accuracy >= 0.38

2015-05-07 Thread Vinod KC (JIRA)
Vinod KC created SPARK-7438:
---

 Summary: Validation Error while running countApproxDistinct with 
relative accuracy >= 0.38  
 Key: SPARK-7438
 URL: https://issues.apache.org/jira/browse/SPARK-7438
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Reporter: Vinod KC
Priority: Minor


Eg Code: 
val a = sc.parallelize(1 to 1, 20)
val b = a++a++a++a++a
b.countApproxDistinct(0.38)
java.lang.IllegalArgumentException: requirement failed: p (3) must be at least 4

Issue 1: When relative accuracy >= 0.38, IAE is thrown, as the precision p 
evaluates to 3.
However, the same input in countApproxDistinctByKey(0.38) works fine. Usage of 
relativeSD should be consistent in both countApproxDistinct and 
countApproxDistinctByKey.
Issue 2: Validation error message "p (3) must be at least 4" is not giving a 
clue on what went wrong.
Issue 3: When relative accuracy > 0.17, a proper validation error message 
is not shown from countApproxDistinct.







[jira] [Commented] (SPARK-7103) SparkContext.union crashed when some RDDs have no partitioner

2015-04-24 Thread Vinod KC (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-7103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14510621#comment-14510621
 ] 

Vinod KC commented on SPARK-7103:
-

I closed PR #5678.
Thanks.

 SparkContext.union crashed when some RDDs have no partitioner
 -

 Key: SPARK-7103
 URL: https://issues.apache.org/jira/browse/SPARK-7103
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 1.3.0, 1.3.1
Reporter: Steven She
Priority: Minor

 I encountered a bug where Spark crashes with the following stack trace:
 {noformat}
 java.util.NoSuchElementException: None.get
   at scala.None$.get(Option.scala:313)
   at scala.None$.get(Option.scala:311)
   at org.apache.spark.rdd.PartitionerAwareUnionRDD.getPartitions(PartitionerAwareUnionRDD.scala:69)
 {noformat}
 Here's a minimal example that reproduces it on the Spark shell:
 {noformat}
 val x = sc.parallelize(Seq(1 -> true, 2 -> true, 3 -> false)).partitionBy(new 
 HashPartitioner(1))
 val y = sc.parallelize(Seq(1 -> true))
 sc.union(y, x).count() // crashes
 sc.union(x, y).count() // This works since the first RDD has a partitioner
 {noformat}
 We had to resort to instantiating the UnionRDD directly to avoid the 
 PartitionerAwareUnionRDD.
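The failure mode and the workaround can be sketched with a stand-in type; `RddLike` and `chooseUnion` below are hypothetical illustrations, not Spark's actual fix:

```scala
// Only the presence of a partitioner matters for this bug.
final case class RddLike(name: String, partitioner: Option[String])

// Take the partitioner-aware path only when *every* input has a partitioner;
// otherwise fall back to a plain UnionRDD, which is what we resorted to
// manually to avoid the None.get crash.
def chooseUnion(rdds: Seq[RddLike]): String =
  if (rdds.nonEmpty && rdds.forall(_.partitioner.isDefined)) "PartitionerAwareUnionRDD"
  else "UnionRDD"

val x = RddLike("x", partitioner = Some("HashPartitioner(1)"))
val y = RddLike("y", partitioner = None)
assert(chooseUnion(Seq(y, x)) == "UnionRDD")               // the crashing case
assert(chooseUnion(Seq(x, x)) == "PartitionerAwareUnionRDD")
```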






[jira] [Created] (SPARK-7055) getContextOrSparkClassLoader is not used while loading JDBC driver class

2015-04-22 Thread Vinod KC (JIRA)
Vinod KC created SPARK-7055:
---

 Summary: getContextOrSparkClassLoader is not used while loading 
JDBC driver class
 Key: SPARK-7055
 URL: https://issues.apache.org/jira/browse/SPARK-7055
 Project: Spark
  Issue Type: Bug
  Components: SQL
Reporter: Vinod KC


In JDBCRDD.scala, the getConnector method uses Class.forName to load the JDBC 
driver class. Instead, it should use the getContextOrSparkClassLoader.loadClass method.
There was a resolved JIRA, SPARK-6966, which handled the same issue only in 
'JDBCRelation.scala'.
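The intent of getContextOrSparkClassLoader can be sketched in plain Scala; `getContextOrDefaultClassLoader` below is a simplified reconstruction (falling back to the system loader rather than Spark's own loader), not the real Utils method:

```scala
// Prefer the thread's context class loader (which may see user-supplied JDBC
// driver jars); fall back to a default loader when none is set.
def getContextOrDefaultClassLoader: ClassLoader =
  Option(Thread.currentThread().getContextClassLoader)
    .getOrElse(ClassLoader.getSystemClassLoader)

// Load through the resolved loader instead of Class.forName, which always
// uses the caller's defining class loader.
val driverClass = getContextOrDefaultClassLoader.loadClass("java.sql.DriverManager")
assert(driverClass.getName == "java.sql.DriverManager")
```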






[jira] [Commented] (SPARK-6733) Suppression of usage of Scala existential code should be done

2015-04-08 Thread Vinod KC (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14484841#comment-14484841
 ] 

Vinod KC commented on SPARK-6733:
-

Can you please assign this to me?

 Suppression of usage of Scala existential code should be done
 -

 Key: SPARK-6733
 URL: https://issues.apache.org/jira/browse/SPARK-6733
 Project: Spark
  Issue Type: Improvement
  Components: Scheduler
Affects Versions: 1.3.0
 Environment: OS: OSX Yosemite
 Hardware: Intel Core i7 with 16 GB RAM
Reporter: Raymond Tay
Priority: Trivial
 Fix For: 1.4.0


 The inclusion of this statement in the file 
 {code:scala}
 import scala.language.existentials
 {code}
 should have suppressed all warnings regarding the use of scala existential 
 code.






[jira] [Created] (SPARK-6445) IOException: Filesystem closed is thrown while exiting spark-sql console

2015-03-21 Thread Vinod KC (JIRA)
Vinod KC created SPARK-6445:
---

 Summary: IOException: Filesystem closed is thrown while exiting 
spark-sql console
 Key: SPARK-6445
 URL: https://issues.apache.org/jira/browse/SPARK-6445
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.3.0
 Environment: hadoop version  2.2.0
Reporter: Vinod KC
Priority: Minor


When exiting the spark-sql console, the following exception is thrown:

Exception in thread "Thread-3" java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:629)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1677)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1106)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1102)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1102)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1397)
at org.apache.spark.scheduler.EventLoggingListener.stop(EventLoggingListener.scala:196)
at org.apache.spark.SparkContext$$anonfun$stop$4.apply(SparkContext.scala:1388)
at org.apache.spark.SparkContext$$anonfun$stop$4.apply(SparkContext.scala:1388)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.SparkContext.stop(SparkContext.scala:1388)
at org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.stop(SparkSQLEnv.scala:66)
at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$$anon$1.run(SparkSQLCLIDriver.scala:107)
 






[jira] [Created] (SPARK-6317) Interactive HIVE scala console is not starting

2015-03-13 Thread Vinod KC (JIRA)
Vinod KC created SPARK-6317:
---

 Summary: Interactive HIVE scala console is not starting
 Key: SPARK-6317
 URL: https://issues.apache.org/jira/browse/SPARK-6317
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.3.0
Reporter: Vinod KC
Priority: Blocker


  build/sbt hive/console  is failing  

[info] Starting scala interpreter...
[info] 
<console>:15: error: object Dsl is not a member of package org.apache.spark.sql
       import org.apache.spark.sql.Dsl._
              ^







[jira] [Commented] (SPARK-6223) Avoid Build warning- enable implicit value scala.language.existentials visible

2015-03-09 Thread Vinod KC (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14352837#comment-14352837
 ] 

Vinod KC commented on SPARK-6223:
-

I'm working on this

 Avoid Build warning- enable implicit value scala.language.existentials visible
 --

 Key: SPARK-6223
 URL: https://issues.apache.org/jira/browse/SPARK-6223
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.3.0
Reporter: Vinod KC
Priority: Trivial

 spark/sql/core/src/main/scala/org/apache/spark/sql/sources/ddl.scala:316: 
 inferred existential type Option[(Class[_$4], 
 org.apache.spark.sql.sources.BaseRelation)] forSome { type _$4 }, which 
 cannot be expressed by wildcards,  should be enabled by making the implicit 
 value scala.language.existentials visible.
 This can be achieved by adding the import clause 'import 
 scala.language.existentials'






[jira] [Created] (SPARK-6223) Avoid Build warning- enable implicit value scala.language.existentials visible

2015-03-09 Thread Vinod KC (JIRA)
Vinod KC created SPARK-6223:
---

 Summary: Avoid Build warning- enable implicit value 
scala.language.existentials visible
 Key: SPARK-6223
 URL: https://issues.apache.org/jira/browse/SPARK-6223
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 1.3.0
Reporter: Vinod KC
Priority: Trivial


spark/sql/core/src/main/scala/org/apache/spark/sql/sources/ddl.scala:316: 
inferred existential type Option[(Class[_$4], 
org.apache.spark.sql.sources.BaseRelation)] forSome { type _$4 }, which cannot 
be expressed by wildcards,  should be enabled by making the implicit value 
scala.language.existentials visible.
This can be achieved by adding the import clause 'import 
scala.language.existentials'






[jira] [Created] (SPARK-6178) Remove unused imports from java classes

2015-03-04 Thread Vinod KC (JIRA)
Vinod KC created SPARK-6178:
---

 Summary: Remove unused imports from java classes 
 Key: SPARK-6178
 URL: https://issues.apache.org/jira/browse/SPARK-6178
 Project: Spark
  Issue Type: Improvement
  Components: Shuffle
Affects Versions: 1.3.0
Reporter: Vinod KC
Priority: Trivial








