Re: Compile failure with SBT on master

2014-06-16 Thread Ted Yu
I used the same command on Linux and it passed: Linux k.net 2.6.32-220.23.1.el6.YAHOO.20120713.x86_64 #1 SMP Fri Jul 13 11:40:51 CDT 2012 x86_64 x86_64 x86_64 GNU/Linux Cheers On Mon, Jun 16, 2014 at 9:29 PM, Andrew Ash and...@andrewash.com wrote: I can't run sbt/sbt gen-idea on a clean

Re: Compile failure with SBT on master

2014-06-17 Thread Ted Yu
/RELEASE_X86_64 x86_64 On Mon, Jun 16, 2014 at 10:04 PM, Andrew Ash and...@andrewash.com wrote: Maybe it's a Mac OS X thing? On Mon, Jun 16, 2014 at 9:57 PM, Ted Yu yuzhih...@gmail.com wrote: I used the same command on Linux and it passed: Linux k.net 2.6.32-220.23.1.el6.YAHOO.20120713

Re: (send this email to subscribe)

2014-07-08 Thread Ted Yu
See http://spark.apache.org/news/spark-mailing-lists-moving-to-apache.html Cheers On Jul 8, 2014, at 4:17 AM, Leon Zhang leonca...@gmail.com wrote:

Re: (send this email to subscribe)

2014-07-08 Thread Ted Yu
This is the correct page: http://spark.apache.org/community.html Cheers On Jul 8, 2014, at 4:43 AM, Ted Yu yuzhih...@gmail.com wrote: See http://spark.apache.org/news/spark-mailing-lists-moving-to-apache.html Cheers On Jul 8, 2014, at 4:17 AM, Leon Zhang leonca...@gmail.com wrote:

Re: [VOTE] Release Apache Spark 1.0.2 (RC1)

2014-07-25 Thread Ted Yu
HADOOP-10456 is fixed in hadoop 2.4.1 Does this mean that synchronization on HadoopRDD.CONFIGURATION_INSTANTIATION_LOCK can be bypassed for hadoop 2.4.1 ? Cheers On Fri, Jul 25, 2014 at 6:00 PM, Patrick Wendell pwend...@gmail.com wrote: The most important issue in this release is actually an

Re: Working Formula for Hive 0.13?

2014-07-28 Thread Ted Yu
I found 0.13.1 artifacts in maven: http://search.maven.org/#artifactdetails%7Corg.apache.hive%7Chive-metastore%7C0.13.1%7Cjar However, Spark uses groupId of org.spark-project.hive, not org.apache.hive Can someone tell me how it is supposed to work ? Cheers On Mon, Jul 28, 2014 at 7:44 AM,

Re: Working Formula for Hive 0.13?

2014-07-28 Thread Ted Yu
? I am no expert on the Hive artifacts, just remembering what the issue was initially in case it helps you get to a similar solution. On Mon, Jul 28, 2014 at 4:47 PM, Ted Yu yuzhih...@gmail.com wrote: hive-exec (as of 0.13.1) is published here: http://search.maven.org/#artifactdetails

Re: Working Formula for Hive 0.13?

2014-07-28 Thread Ted Yu
can fix that issue. If not, we'll have to continue forking our own version of Hive to change the way it publishes artifacts. - Patrick On Mon, Jul 28, 2014 at 9:34 AM, Ted Yu yuzhih...@gmail.com wrote: Talked with Owen offline. He confirmed that as of 0.13, hive-exec is still uber jar

Re: Working Formula for Hive 0.13?

2014-07-28 Thread Ted Yu
: AFAIK, according to a recent talk, the Hulu team in China has built Spark SQL against Hive 0.13 (or 0.13.1?) successfully. Basically they also re-packaged Hive 0.13 as what the Spark team did. The slides of the talk haven't been released yet though. On Tue, Jul 29, 2014 at 1:01 AM, Ted Yu

Re: Working Formula for Hive 0.13?

2014-07-28 Thread Ted Yu
PM, Ted Yu yuzhih...@gmail.com wrote: After manually copying hive 0.13.1 jars to local maven repo, I got the following errors when building spark-hive_2.10 module : [ERROR] /homes/xx/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala:182: type mismatch; found

Re: Working Formula for Hive 0.13?

2014-07-28 Thread Ted Yu
have is, What is the goal of upgrading to hive 0.13.0? Is it purely because you are having problems connecting to newer metastores? Are there some features you are hoping for? This will help me prioritize this effort. Michael On Mon, Jul 28, 2014 at 4:05 PM, Ted Yu yuzhih...@gmail.com wrote

Re: subscribe dev list for spark

2014-07-30 Thread Ted Yu
See Mailing list section of: https://spark.apache.org/community.html On Wed, Jul 30, 2014 at 6:53 PM, Grace syso...@gmail.com wrote:

Re: failed to build spark with maven for both 1.0.1 and latest master branch

2014-07-31 Thread Ted Yu
The following command succeeded (on Linux) on Spark master checked out this morning: mvn -Pyarn -Phive -Phadoop-2.4 -DskipTests install FYI On Thu, Jul 31, 2014 at 1:36 PM, yao yaosheng...@gmail.com wrote: Hi TD, I've asked my colleagues to do the same thing but compile still fails.

compilation error in Catalyst module

2014-08-06 Thread Ted Yu
I refreshed my workspace. I got the following error with this command: mvn -Pyarn -Phive -Phadoop-2.4 -DskipTests install [ERROR] bad symbolic reference. A signature in package.class refers to term scalalogging in package com.typesafe which is not available. It may be completely missing from the

Re: compilation error in Catalyst module

2014-08-06 Thread Ted Yu
Forgot to do that step. Now compilation passes. On Wed, Aug 6, 2014 at 1:36 PM, Zongheng Yang zonghen...@gmail.com wrote: Hi Ted, By refreshing do you mean you have done 'mvn clean'? On Wed, Aug 6, 2014 at 1:17 PM, Ted Yu yuzhih...@gmail.com wrote: I refreshed my workspace. I got

Re: Unit tests in 5 minutes

2014-08-08 Thread Ted Yu
How about using parallel execution feature of maven-surefire-plugin (assuming all the tests were made parallel friendly) ? http://maven.apache.org/surefire/maven-surefire-plugin/examples/fork-options-and-parallel-execution.html Cheers On Fri, Aug 8, 2014 at 9:14 AM, Sean Owen

reference to dstream in package org.apache.spark.streaming which is not available

2014-08-22 Thread Ted Yu
Hi, Using the following command on (refreshed) master branch: mvn clean package -DskipTests I got: constituent[36]: file:/homes/hortonzy/apache-maven-3.1.1/conf/logging/ --- java.lang.reflect.InvocationTargetException at

Re: Dependency hell in Spark applications

2014-09-05 Thread Ted Yu
From output of dependency:tree: [INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ spark-streaming_2.10 --- [INFO] org.apache.spark:spark-streaming_2.10:jar:1.1.0-SNAPSHOT [INFO] +- org.apache.spark:spark-core_2.10:jar:1.1.0-SNAPSHOT:compile [INFO] | +-

BasicOperationsSuite failing ?

2014-09-29 Thread Ted Yu
Hi, Running test suite in trunk, I got: ^[[32mBasicOperationsSuite:^[[0m ^[[32m- map^[[0m ^[[32m- flatMap^[[0m ^[[32m- filter^[[0m ^[[32m- glom^[[0m ^[[32m- mapPartitions^[[0m ^[[32m- repartition (more partitions)^[[0m ^[[32m- repartition (fewer partitions)^[[0m ^[[32m- groupByKey^[[0m ^[[32m-

Re: Extending Scala style checks

2014-10-01 Thread Ted Yu
Please take a look at WhitespaceEndOfLineChecker under: http://www.scalastyle.org/rules-0.1.0.html Cheers On Wed, Oct 1, 2014 at 2:01 PM, Nicholas Chammas nicholas.cham...@gmail.com wrote: As discussed here https://github.com/apache/spark/pull/2619, it would be good to extend our Scala style

Re: something wrong with Jenkins or something untested merged?

2014-10-20 Thread Ted Yu
I performed build on latest master branch but didn't get compilation error. FYI On Mon, Oct 20, 2014 at 3:51 PM, Nan Zhu zhunanmcg...@gmail.com wrote: Hi, I just submitted a patch https://github.com/apache/spark/pull/2864/files with one line change but the Jenkins told me it's failed to

Re: scalastyle annoys me a little bit

2014-10-23 Thread Ted Yu
Koert: Have you tried adding the following on your commandline ? -Dscalastyle.failOnViolation=false Cheers On Thu, Oct 23, 2014 at 11:07 AM, Patrick Wendell pwend...@gmail.com wrote: Hey Koert, I think disabling the style checks in maven package could be a good idea for the reason you

Re: scalastyle annoys me a little bit

2014-10-23 Thread Ted Yu
goal org.scalastyle:scalastyle-maven-plugin:0.4.0:check (default) on project spark-core_2.10: Failed during scalastyle execution: You have 3 Scalastyle violation(s). - [Help 1] On Thu, Oct 23, 2014 at 2:14 PM, Ted Yu yuzhih...@gmail.com wrote: Koert: Have you tried adding the following

Re: scalastyle annoys me a little bit

2014-10-23 Thread Ted Yu
Created SPARK-4066 and attached patch there. On Thu, Oct 23, 2014 at 1:07 PM, Koert Kuipers ko...@tresata.com wrote: great thanks i will do that On Thu, Oct 23, 2014 at 3:55 PM, Ted Yu yuzhih...@gmail.com wrote: Koert: If you have time, you can try this diff - with which you would be able

Re: create_image.sh contains broken hadoop web link

2014-11-05 Thread Ted Yu
Have you seen this thread ? http://search-hadoop.com/m/LgpTk2Pnw6O/andrew+apache+mirrorsubj=Re+All+mirrored+download+links+from+the+Apache+Hadoop+site+are+broken Cheers On Wed, Nov 5, 2014 at 7:36 PM, Nicholas Chammas nicholas.cham...@gmail.com wrote: As part of my work for SPARK-3821

Re: create_image.sh contains broken hadoop web link

2014-11-05 Thread Ted Yu
pointed to also appears to be broken now: http://apache.mesi.com.ar/hadoop/common/ Nick On Wed, Nov 5, 2014 at 10:43 PM, Ted Yu yuzhih...@gmail.com wrote: Have you seen this thread ? http://search-hadoop.com/m/LgpTk2Pnw6O/andrew+apache+mirrorsubj=Re+All+mirrored+download+links+from+the+Apache

Re: Has anyone else observed this build break?

2014-11-15 Thread Ted Yu
Sorry for the late reply. I tested my patch on Mac with the following JDK: java version 1.7.0_60 Java(TM) SE Runtime Environment (build 1.7.0_60-b19) Java HotSpot(TM) 64-Bit Server VM (build 24.60-b09, mixed mode) Let me see if the problem can be solved upstream in HBase hbase-annotations

Re: Has anyone else observed this build break?

2014-11-15 Thread Ted Yu
it from various hbase modules: https://github.com/apache/spark/pull/3286 Cheers https://github.com/apache/spark/pull/3286 On Sat, Nov 15, 2014 at 6:56 AM, Ted Yu yuzhih...@gmail.com wrote: Sorry for the late reply. I tested my patch on Mac with the following JDK: java version 1.7.0_60 Java

Re: How spark and hive integrate in long term?

2014-11-21 Thread Ted Yu
bq. spark-0.12 also has some nice feature added Minor correction: you meant Spark 1.2.0 I guess Cheers On Fri, Nov 21, 2014 at 3:45 PM, Zhan Zhang zzh...@hortonworks.com wrote: Thanks Dean, for the information. Hive-on-spark is nice. Spark sql has the advantage to take the full advantage

Re: Required file not found in building

2014-12-01 Thread Ted Yu
I tried the same command on MacBook and didn't experience the same error. Which OS are you using ? Cheers On Mon, Dec 1, 2014 at 6:42 PM, Stephen Boesch java...@gmail.com wrote: It seems there were some additional settings required to build spark now . This should be a snap for most of you

Re: Required file not found in building

2014-12-01 Thread Ted Yu
zip for 0.3.5.3 was downloaded and exploded. Then I ran sbt dist/create . zinc is being launched from dist/target/zinc-0.3.5.3/bin/zinc 2014-12-01 20:12 GMT-08:00 Ted Yu yuzhih...@gmail.com: I use zinc 0.2.0 and started zinc with the same command shown below. I don't observe such error

Re: Unit tests in 5 minutes

2014-12-04 Thread Ted Yu
Have you seen this thread http://search-hadoop.com/m/JW1q5xxSAa2 ? Test categorization in HBase is done through maven-surefire-plugin Cheers On Thu, Dec 4, 2014 at 4:05 PM, Nicholas Chammas nicholas.cham...@gmail.com wrote: fwiw, when we did this work in HBase, we categorized the tests. Then

Re: Unit tests in 5 minutes

2014-12-06 Thread Ted Yu
bq. I may move on to trying Maven. Maven is my favorite :-) On Sat, Dec 6, 2014 at 10:54 AM, Nicholas Chammas nicholas.cham...@gmail.com wrote: Ted, I posted some updates

Re: Nabble mailing list mirror errors: This post has NOT been accepted by the mailing list yet

2014-12-19 Thread Ted Yu
Andy: I saw two emails from you from yesterday. See this thread: http://search-hadoop.com/m/JW1q5opRsY1 Cheers On Fri, Dec 19, 2014 at 12:51 PM, Andy Konwinski andykonwin...@gmail.com wrote: Yesterday, I changed the domain name in the mailing list archive settings to remove .incubator so

Re: Assembly jar file name does not match profile selection

2014-12-26 Thread Ted Yu
Can you try this command ? sbt/sbt -Pyarn -Phadoop-2.4 -Dhadoop.version=2.6.0 -Phive assembly On Fri, Dec 26, 2014 at 6:15 PM, Alessandro Baretta alexbare...@gmail.com wrote: I am building spark with sbt off of branch 1.2. I'm using the following command: sbt/sbt -Pyarn -Phadoop-2.3

Re: Why the major.minor version of the new hive-exec is 51.0?

2014-12-30 Thread Ted Yu
I extracted org/apache/hadoop/hive/common/CompressionUtils.class from the jar and used hexdump to view the class file. Bytes 6 and 7 are 00 and 33, respectively. According to http://en.wikipedia.org/wiki/Java_class_file, the jar was produced using Java 7. FYI On Tue, Dec 30, 2014 at 8:09 PM,
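A minimal Scala sketch of the same check without hexdump (the file name is a placeholder; major version 51, i.e. 0x33, corresponds to Java 7, 50 to Java 6):

    import java.io.{DataInputStream, FileInputStream}

    object ClassFileVersion {
      def main(args: Array[String]): Unit = {
        // Class file layout: magic (4 bytes), minor version (2 bytes), major version (2 bytes).
        val in = new DataInputStream(new FileInputStream("CompressionUtils.class"))
        try {
          require(in.readInt() == 0xCAFEBABE, "not a class file")
          val minor = in.readUnsignedShort() // bytes 4-5
          val major = in.readUnsignedShort() // bytes 6-7: 51 (0x33) = Java 7
          println(s"major=$major minor=$minor")
        } finally in.close()
      }
    }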

Re: Welcoming three new committers

2015-02-03 Thread Ted Yu
Congratulations, Cheng, Joseph and Sean. On Tue, Feb 3, 2015 at 2:53 PM, Nicholas Chammas nicholas.cham...@gmail.com wrote: Congratulations guys! On Tue Feb 03 2015 at 2:36:12 PM Matei Zaharia matei.zaha...@gmail.com wrote: Hi all, The PMC recently voted to add three new committers:

Re: Standardized Spark dev environment

2015-01-20 Thread Ted Yu
How many profiles (hadoop / hive /scala) would this development environment support ? Cheers On Tue, Jan 20, 2015 at 4:13 PM, Nicholas Chammas nicholas.cham...@gmail.com wrote: What do y'all think of creating a standardized Spark development environment, perhaps encoded as a Vagrantfile, and

Re: run time exceptions in Spark 1.2.0 manual build together with OpenStack hadoop driver

2015-01-18 Thread Ted Yu
Please take a look at SPARK-4048 and SPARK-5108 Cheers On Sat, Jan 17, 2015 at 10:26 PM, Gil Vernik g...@il.ibm.com wrote: Hi, I took a source code of Spark 1.2.0 and tried to build it together with hadoop-openstack.jar ( To allow Spark an access to OpenStack Swift ) I used Hadoop 2.6.0.

Re: 1.2.1 start-all.sh broken?

2015-02-11 Thread Ted Yu
nicholas.cham...@gmail.com wrote: lol yeah, I changed the path for the email... turned out to be the issue itself. On Wed Feb 11 2015 at 2:43:09 PM Ted Yu yuzhih...@gmail.com wrote: I see. '/path/to/spark-1.2.1-bin-hadoop2.4' didn't contain space :-) On Wed, Feb 11, 2015 at 2:41 PM, Nicholas

Re: 1.2.1 start-all.sh broken?

2015-02-11 Thread Ted Yu
I downloaded the 1.2.1 tarball for hadoop 2.4. I got: ls lib/ datanucleus-api-jdo-3.2.6.jar datanucleus-rdbms-3.2.9.jar spark-assembly-1.2.1-hadoop2.4.0.jar datanucleus-core-3.2.10.jar spark-1.2.1-yarn-shuffle.jar spark-examples-1.2.1-hadoop2.4.0.jar FYI On Wed, Feb 11, 2015 at 2:27 PM,

Re: 1.2.1 start-all.sh broken?

2015-02-11 Thread Ted Yu
spark-assembly-1.2.1-hadoop2.4.0.jar spark-examples-1.2.1-hadoop2.4.0.jar So that looks correct… Hmm. Nick ​ On Wed Feb 11 2015 at 2:34:51 PM Ted Yu yuzhih...@gmail.com wrote: I downloaded 1.2.1 tar ball for hadoop 2.4 I got: ls lib/ datanucleus-api-jdo-3.2.6.jar datanucleus-rdbms

Re: Intellij IDEA 14 env setup; NoClassDefFoundError when run examples

2015-01-31 Thread Ted Yu
Have you read / followed this ? https://cwiki.apache.org/confluence/display/SPARK/Useful+Developer+Tools#UsefulDeveloperTools-BuildingSparkinIntelliJIDEA Cheers On Sat, Jan 31, 2015 at 8:01 PM, Yafeng Guo daniel.yafeng@gmail.com wrote: Hi, I'm setting up a dev environment with Intellij

Re: python converter in HBaseConverter.scala(spark/examples)

2015-01-05 Thread Ted Yu
converters would be part of external projects that can be listed with http://spark-packages.org/ I see your project is already listed there. — Sent from Mailbox https://www.dropbox.com/mailbox On Mon, Jan 5, 2015 at 5:37 PM, Ted Yu yuzhih...@gmail.com wrote: In my opinion this would be useful

Re: Results of tests

2015-01-09 Thread Ted Yu
/ From: Ted Yu [yuzhih...@gmail.com] Sent: Thursday, January 8, 2015 17:43 To: Tony Reix Cc: dev@spark.apache.org Subject: Re: Results of tests Here it is: [centos] $ /home/jenkins/tools/hudson.tasks.Maven_MavenInstallation/Maven_3.0.5/bin/mvn -DHADOOP_PROFILE=hadoop-2.4 -Dlabel=centos

Re: Results of tests

2015-01-09 Thread Ted Yu
might be able to integrate the PySpark tests here, too (I think it's just a matter of getting the Python test runner to generate the correct test result XML output). On Fri, Jan 9, 2015 at 10:47 AM, Ted Yu yuzhih...@gmail.com wrote: For a build which uses JUnit, we would see a summary

Re: python converter in HBaseConverter.scala(spark/examples)

2015-01-05 Thread Ted Yu
In my opinion this would be useful - there was another thread where returning only the value of first column in the result was mentioned. Please create a SPARK JIRA and a pull request. Cheers On Mon, Jan 5, 2015 at 6:42 AM, tgbaggio gen.tan...@gmail.com wrote: Hi, In HBaseConverter.scala

Re: Results of tests

2015-01-08 Thread Ted Yu
=centos/testReport/ ? (I'm not authorized to look at the configuration part) Thx ! Tony -- *From:* Ted Yu [yuzhih...@gmail.com] *Sent:* Thursday, January 8, 2015 16:11 *To:* Tony Reix *Cc:* dev@spark.apache.org *Subject:* Re: Results of tests Please take a look

Re: Wrong version on the Spark documentation page

2015-03-15 Thread Ted Yu
When I enter http://spark.apache.org/docs/latest/ into Chrome address bar, I saw 1.3.0 Cheers On Sun, Mar 15, 2015 at 11:12 AM, Patrick Wendell pwend...@gmail.com wrote: Cheng - what if you hold shift+refresh? For me the /latest link correctly points to 1.3.0 On Sun, Mar 15, 2015 at 10:40

Re: Error: 'SparkContext' object has no attribute 'getActiveStageIds'

2015-03-20 Thread Ted Yu
Please take a look at core/src/main/scala/org/apache/spark/SparkStatusTracker.scala, around line 58: def getActiveStageIds(): Array[Int] = { Cheers On Fri, Mar 20, 2015 at 3:59 PM, xing ehomec...@gmail.com wrote: getStageInfo in self._jtracker.getStageInfo below seems not
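A minimal usage sketch of that API in a spark-shell session (where sc is predefined); only the method name comes from the source line quoted above:

    // The stage IDs are exposed through the status tracker, not on SparkContext directly.
    val activeStages: Array[Int] = sc.statusTracker.getActiveStageIds()
    println(s"active stages: ${activeStages.mkString(", ")}")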

Re: GitHub Syncing Down

2015-03-11 Thread Ted Yu
Looks like github is functioning again (I no longer encounter this problem when pushing to hbase repo). Do you want to give it a try ? Cheers On Tue, Mar 10, 2015 at 6:54 PM, Michael Armbrust mich...@databricks.com wrote: FYI: https://issues.apache.org/jira/browse/INFRA-9259

Re: Jira Issues

2015-03-25 Thread Ted Yu
Issues are tracked on Apache JIRA: https://issues.apache.org/jira/browse/SPARK/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel Cheers On Wed, Mar 25, 2015 at 1:51 PM, Igor Costa igorco...@apache.org wrote: Hi there Guys. I want to be more collaborative to Spark, but I have

Re: should we add a start-masters.sh script in sbin?

2015-03-31 Thread Ted Yu
Sounds good to me. On Tue, Mar 31, 2015 at 6:12 PM, sequoiadb mailing-list-r...@sequoiadb.com wrote: Hey, start-slaves.sh script is able to read from slaves file and start slaves node in multiple boxes. However in standalone mode if I want to use multiple masters, I’ll have to start

Re: One corrupt gzip in a directory of 100s

2015-04-01 Thread Ted Yu
bq. writing the output (to Amazon S3) failed What's the value of fs.s3.maxRetries ? Increasing the value should help. Cheers On Wed, Apr 1, 2015 at 8:34 AM, Romi Kuntsman r...@totango.com wrote: What about communication errors and not corrupted files? Both when reading input and when writing
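A hedged sketch of raising that setting from a spark-shell session (sc predefined); the retry count of 10 is only an example value:

    // Raise the Hadoop-side S3 retry count picked up by the S3 FileSystem used for reads/writes.
    sc.hadoopConfiguration.setInt("fs.s3.maxRetries", 10)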

Re: trouble with sbt building network-* projects?

2015-02-27 Thread Ted Yu
bq. to be able to run my tests in sbt, though, it makes the development iterations much faster. Was the preference for sbt due to long maven build time ? Have you started Zinc on your machine ? Cheers On Fri, Feb 27, 2015 at 11:10 AM, Imran Rashid iras...@cloudera.com wrote: Has anyone else

Re: trouble with sbt building network-* projects?

2015-02-27 Thread Ted Yu
a full rebuild of those projects even when I haven't touched them. On Fri, Feb 27, 2015 at 1:14 PM, Ted Yu yuzhih...@gmail.com wrote: bq. to be able to run my tests in sbt, though, it makes the development iterations much faster. Was the preference for sbt due to long maven build time

Re: org.spark-project.jetty and guava repo locations

2015-04-02 Thread Ted Yu
Take a look at the maven-shade-plugin in pom.xml. Here is the snippet for org.spark-project.jetty : <relocation> <pattern>org.eclipse.jetty</pattern> <shadedPattern>org.spark-project.jetty</shadedPattern> <includes>

Re: [sql] Dataframe how to check null values

2015-04-20 Thread Ted Yu
I found: https://issues.apache.org/jira/browse/SPARK-6573 On Apr 20, 2015, at 4:29 AM, Peter Rudenko petro.rude...@gmail.com wrote: Sounds very good. Is there a jira for this? Would be cool to have in 1.4, because currently cannot use dataframe.describe function with NaN values, need to

Re: [discuss] DataFrame function namespacing

2015-04-30 Thread Ted Yu
IMHO I would go with choice #1 Cheers On Wed, Apr 29, 2015 at 10:03 PM, Reynold Xin r...@databricks.com wrote: We definitely still have the name collision problem in SQL. On Wed, Apr 29, 2015 at 10:01 PM, Punyashloka Biswal punya.bis...@gmail.com wrote: Do we still have to keep the

Re: [discuss] ending support for Java 6?

2015-04-30 Thread Ted Yu
+1 on ending support for Java 6. BTW from https://www.java.com/en/download/faq/java_7.xml : After April 2015, Oracle will no longer post updates of Java SE 7 to its public download sites. On Thu, Apr 30, 2015 at 1:34 PM, Punyashloka Biswal punya.bis...@gmail.com wrote: I'm in favor of ending

Re: [discuss] ending support for Java 6?

2015-05-02 Thread Ted Yu
+1 On Sat, May 2, 2015 at 1:09 PM, Mridul Muralidharan mri...@gmail.com wrote: We could build on minimum jdk we support for testing pr's - which will automatically cause build failures in case code uses newer api ? Regards, Mridul On Fri, May 1, 2015 at 2:46 PM, Reynold Xin

Re: Speeding up Spark build during development

2015-05-01 Thread Ted Yu
Pramod: Please remember to run Zinc so that the build is faster. Cheers On Fri, May 1, 2015 at 9:36 AM, Ulanov, Alexander alexander.ula...@hp.com wrote: Hi Pramod, For cluster-like tests you might want to use the same code as in mllib's LocalClusterSparkContext. You can rebuild only the

Re: Mima test failure in the master branch?

2015-04-30 Thread Ted Yu
Looks like this has been taken care of: commit beeafcfd6ee1e460c4d564cd1515d8781989b422 Author: Patrick Wendell patr...@databricks.com Date: Thu Apr 30 20:33:36 2015 -0700 Revert [SPARK-5213] [SQL] Pluggable SQL Parser Support On Thu, Apr 30, 2015 at 7:58 PM, zhazhan

Re: [discuss] ending support for Java 6?

2015-04-30 Thread Ted Yu
But it is hard to know how long customers stay with their most recent download. Cheers On Thu, Apr 30, 2015 at 2:26 PM, Sree V sree_at_ch...@yahoo.com.invalid wrote: If there is any possibility of getting the download counts, then we can use it as EOS criteria as well. Say, if download counts

Re: unable to extract tgz files downloaded from spark

2015-05-06 Thread Ted Yu
From which site did you download the tar ball ? Which package type did you choose (pre-built for which distro) ? Thanks On Wed, May 6, 2015 at 7:16 PM, Praveen Kumar Muthuswamy muthusamy...@gmail.com wrote: Hi I have been trying to install latest spark verison and downloaded the .tgz

Re: Recent Spark test failures

2015-05-11 Thread Ted Yu
actually the worst if tests fail sometimes but not others, because we can't reproduce them deterministically. Using -M and -A actually tolerates flaky tests to a certain extent, and I would prefer to instead increase the determinism in these tests. -Andrew 2015-05-08 17:56 GMT-07:00 Ted Yu yuzhih

Re: [PySpark DataFrame] When a Row is not a Row

2015-05-11 Thread Ted Yu
In Row#equals(): while (i < len) { if (apply(i) != that.apply(i)) { '!=' should be !apply(i).equals(that.apply(i)) ? Cheers On Mon, May 11, 2015 at 1:49 PM, Nicholas Chammas nicholas.cham...@gmail.com wrote: This is really strange. # Spark 1.3.1 print type(results) class

Re: Build fail...

2015-05-08 Thread Ted Yu
Looks like you're right: https://amplab.cs.berkeley.edu/jenkins/view/Spark/job/Spark-1.3-Maven-with-YARN/HADOOP_PROFILE=hadoop-2.4,label=centos/427/console [error]

Re: jackson.databind exception in RDDOperationScope.jsonMapper.writeValueAsString(this)

2015-05-06 Thread Ted Yu
Looks like a mismatch of jackson versions. Spark uses: <fasterxml.jackson.version>2.4.4</fasterxml.jackson.version> FYI On Wed, May 6, 2015 at 8:00 AM, A.M.Chan kaka_1...@163.com wrote: Hey, guys. I meet this exception while testing SQL/Columns. I didn't change the pom or the core project. In

Re: Recent Spark test failures

2015-05-08 Thread Ted Yu
Andrew: Do you think the -M and -A options described here can be used in test runs ? http://scalatest.org/user_guide/using_the_runner Cheers On Wed, May 6, 2015 at 5:41 PM, Andrew Or and...@databricks.com wrote: Dear all, I'm sure you have all noticed that the Spark tests have been fairly

Re: How to link code pull request with JIRA ID?

2015-05-13 Thread Ted Yu
Subproject tag should follow SPARK JIRA number. e.g. [SPARK-5277][SQL] ... Cheers On Wed, May 13, 2015 at 11:50 AM, Stephen Boesch java...@gmail.com wrote: following up from Nicholas, it is [SPARK-12345] Your PR description where 12345 is the jira number. One thing I tend to forget is

Re: Recent Spark test failures

2015-05-15 Thread Ted Yu
lately: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/ Maybe PR builder doesn't build against hadoop 2.4 ? Cheers On Mon, May 11, 2015 at 1:11 PM, Ted Yu yuzhih...@gmail.com wrote: Makes sense. Having high determinism in these tests would make Jenkins build stable

Re: Recent Spark test failures

2015-05-15 Thread Ted Yu
, 2015 at 9:23 AM, Ted Yu yuzhih...@gmail.com wrote: Jenkins build against hadoop 2.4 has been unstable recently: https://amplab.cs.berkeley.edu/jenkins/view/Spark/job/Spark-Master-Maven-with-YARN/HADOOP_PROFILE=hadoop-2.4,label=centos/ I haven't found the test which hung / failed in recent Jenkins

Re: Recent Spark test failures

2015-05-15 Thread Ted Yu
] Running Spark tests with these arguments: -Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0 -Pkinesis-asl test Is anyone testing individual pull requests against Hadoop 2.4 or 2.6 before the code is declared clean? Fred [image: Inactive hide details for Ted Yu ---05/15/2015 09:29:09 AM---Jenkins

Re: how long does it takes for full build ?

2015-04-16 Thread Ted Yu
with spilling, bypass merge-sort Any pointers ? Thanking you. With Regards Sree On Thursday, April 16, 2015 12:01 PM, Ted Yu yuzhih...@gmail.com wrote: You can get some idea by looking at the builds here: https://amplab.cs.berkeley.edu/jenkins/view/Spark/job/Spark-1.2-Maven

Re: [Spark SQL] Java map/flatMap api broken with DataFrame in 1.3.{0,1}

2015-04-17 Thread Ted Yu
The image didn't go through. I think you were referring to: override def map[R: ClassTag](f: Row => R): RDD[R] = rdd.map(f) Cheers On Fri, Apr 17, 2015 at 6:07 AM, Olivier Girardot o.girar...@lateral-thoughts.com wrote: Hi everyone, I had an issue trying to use Spark SQL from Java (8 or
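A small usage sketch against that 1.3-era Scala signature, for a spark-shell session where sc and sqlContext are predefined (the sample data is made up):

    import sqlContext.implicits._

    val df = sc.parallelize(Seq(("alice", 1), ("bob", 2))).toDF("name", "count")
    // DataFrame.map takes Row => R and yields an RDD[R], so the result below is an RDD[String].
    val names = df.map(row => row.getString(0)).collect()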

Re: wait time between start master and start slaves

2015-04-11 Thread Ted Yu
From SparkUI.scala : def getUIPort(conf: SparkConf): Int = { conf.getInt("spark.ui.port", SparkUI.DEFAULT_PORT) } Better to retrieve the effective UI port before probing. Cheers On Sat, Apr 11, 2015 at 2:38 PM, Nicholas Chammas nicholas.cham...@gmail.com wrote: So basically, to tell if the
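A one-line sketch of doing that from a spark-shell session (sc predefined), assuming the standard default port of 4040:

    // Read the effective UI port, falling back to the default when spark.ui.port is unset.
    val uiPort = sc.getConf.getInt("spark.ui.port", 4040)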

Re: Anyone facing problem in incremental building of individual project

2015-06-04 Thread Ted Yu
Andrew Or put in this workaround : diff --git a/pom.xml b/pom.xml index 0b1aaad..d03d33b 100644 --- a/pom.xml +++ b/pom.xml @@ -1438,6 +1438,8 @@ <version>2.3</version> <configuration> <shadedArtifactAttached>false</shadedArtifactAttached> + <!-- Work around MSHADE-148

Re: [VOTE] Release Apache Spark 1.4.1

2015-06-26 Thread Ted Yu
I got the following when running test suite: [INFO] compiler plugin: BasicArtifact(org.scalamacros,paradise_2.10.4,2.0.1,null) ^[[0m[^[[0minfo^[[0m] ^[[0mCompiling 2 Scala sources and 1 Java source to /home/hbase/spark-1.4.1/streaming/target/scala-2.10/test-classes...^[[0m ^[[0m[^[[31merror^[[0m]

Re: [VOTE] Release Apache Spark 1.4.1

2015-06-26 Thread Ted Yu
(OutcomeOf.scala:85)^[[0m The error from previous email was due to absence of StreamingContextSuite.scala On Fri, Jun 26, 2015 at 1:27 PM, Ted Yu yuzhih...@gmail.com wrote: I got the following when running test suite: [INFO] compiler plugin: BasicArtifact(org.scalamacros,paradise_2.10.4,2.0.1

Re: problem with using mapPartitions

2015-05-30 Thread Ted Yu
bq. val result = fDB.mappartitions(testMP).collect Not sure if you pasted the above code - there was a typo: method name should be mapPartitions Cheers On Sat, May 30, 2015 at 9:44 AM, unioah uni...@gmail.com wrote: Hi, I try to aggregate the value in each partition internally. For
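A corrected, self-contained sketch of the call for a spark-shell session (sc predefined); the element type and data are assumptions:

    // mapPartitions (capital P) aggregates each partition internally, emitting one sum per partition.
    val fDB = sc.parallelize(1 to 100, 4)
    def testMP(iter: Iterator[Int]): Iterator[Int] = Iterator(iter.sum)
    val result = fDB.mapPartitions(testMP).collect()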

StreamingContextSuite fails with NoSuchMethodError

2015-05-29 Thread Ted Yu
Hi, I ran the following command on 1.4.0 RC3: mvn -Phadoop-2.4 -Dhadoop.version=2.7.0 -Pyarn -Phive package I saw the following failure: ^[[32mStreamingContextSuite:^[[0m ^[[32m- from no conf constructor^[[0m ^[[32m- from no conf + spark home^[[0m ^[[32m- from no conf + spark home + env^[[0m

Re: StreamingContextSuite fails with NoSuchMethodError

2015-05-30 Thread Ted Yu
I downloaded source tar ball and ran command similar to following with: clean package -DskipTests Then I ran the following command. Fyi On May 30, 2015, at 12:42 AM, Tathagata Das t...@databricks.com wrote: Did was it a clean compilation? TD On Fri, May 29, 2015 at 10:48 PM, Ted

Re: Can not build master

2015-07-03 Thread Ted Yu
are passing on Jenkins so I wonder if it's a maven version issue: https://amplab.cs.berkeley.edu/jenkins/view/Spark-QA-Compile/ - Patrick On Fri, Jul 3, 2015 at 3:14 PM, Ted Yu yuzhih...@gmail.com wrote: Please take a look at SPARK-8781 (https://github.com/apache/spark/pull/7193

Re: Can not build master

2015-07-03 Thread Ted Yu
This is what I got (the last line was repeated non-stop): [INFO] Replacing original artifact with shaded artifact. [INFO] Replacing /home/hbase/spark/bagel/target/spark-bagel_2.10-1.5.0-SNAPSHOT.jar with /home/hbase/spark/bagel/target/spark-bagel_2.10-1.5.0-SNAPSHOT-shaded.jar [INFO]

Re: [VOTE] Release Apache Spark 1.4.1 (RC2)

2015-07-03 Thread Ted Yu
Patrick: I used the following command: ~/apache-maven-3.3.1/bin/mvn -DskipTests -Phadoop-2.4 -Pyarn -Phive clean package The build doesn't seem to stop. Here is tail of build output: [INFO] Dependency-reduced POM written at: /home/hbase/spark-1.4.1/bagel/dependency-reduced-pom.xml [INFO]

Re: [VOTE] Release Apache Spark 1.4.1

2015-06-29 Thread Ted Yu
Here is the command I used: mvn -Phadoop-2.4 -Dhadoop.version=2.7.0 -Pyarn -Phive package Java: 1.8.0_45 OS: Linux x.com 2.6.32-504.el6.x86_64 #1 SMP Wed Oct 15 04:27:16 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Cheers On Mon, Jun 29, 2015 at 12:04 AM, Tathagata Das tathagata.das1...@gmail.com

Re: Spark 1.5.0-SNAPSHOT broken with Scala 2.11

2015-06-28 Thread Ted Yu
Spark-Master-Scala211-Compile build is green. However it is not clear what the actual command is: [EnvInject] - Variables injected successfully. [Spark-Master-Scala211-Compile] $ /bin/bash /tmp/hudson8945334776362889961.sh FYI On Sun, Jun 28, 2015 at 6:02 PM, Alessandro Baretta

Re: Kryo option changed

2015-05-24 Thread Ted Yu
, 2015 at 6:37 PM, Ted Yu yuzhih...@gmail.com wrote: Pardon me. Please use '8192k' Cheers On Sat, May 23, 2015 at 6:24 PM, Debasish Das debasish.da...@gmail.com wrote: Tried 8mb...still I am failing on the same error... On Sat, May 23, 2015 at 6:10 PM, Ted Yu yuzhih...@gmail.com wrote

Re: Kryo option changed

2015-05-23 Thread Ted Yu
bq. it shuld be 8mb Please use the above syntax. Cheers On Sat, May 23, 2015 at 6:04 PM, Debasish Das debasish.da...@gmail.com wrote: Hi, I am on last week's master but all the examples that set up the following .set("spark.kryoserializer.buffer", "8m") are failing with the following error:

Re: Kryo option changed

2015-05-23 Thread Ted Yu
Pardon me. Please use '8192k' Cheers On Sat, May 23, 2015 at 6:24 PM, Debasish Das debasish.da...@gmail.com wrote: Tried 8mb...still I am failing on the same error... On Sat, May 23, 2015 at 6:10 PM, Ted Yu yuzhih...@gmail.com wrote: bq. it shuld be 8mb Please use the above syntax
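A minimal sketch of the configuration being discussed, with the buffer size expressed in 'k' units as suggested above (the app name is a placeholder):

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .setAppName("kryo-buffer-demo")
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .set("spark.kryoserializer.buffer", "8192k") // "8m" was rejected by that build; "8192k" worked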

Re: [IMPORTANT] Committers please update merge script

2015-05-23 Thread Ted Yu
INFRA-9646 has been resolved. FYI On Wed, May 13, 2015 at 6:00 PM, Patrick Wendell pwend...@gmail.com wrote: Hi All - unfortunately the fix introduced another bug, which is that fixVersion was not updated properly. I've updated the script and had one other person test it. So committers

Re: Unable to build from assembly

2015-05-22 Thread Ted Yu
What version of Java do you use ? Can you run this command first ? build/sbt clean BTW please see [SPARK-7498] [MLLIB] add varargs back to setDefault Cheers On Fri, May 22, 2015 at 7:34 AM, Manoj Kumar manojkumarsivaraj...@gmail.com wrote: Hello, I updated my master from upstream

Re: 答复: 答复: Package Release Annoucement: Spark SQL on HBase Astro

2015-08-11 Thread Ted Yu
, *From:* Ted Yu [mailto:yuzhih...@gmail.com] *Sent:* Tuesday, August 11, 2015 3:28 PM *To:* Yan Zhou.sc *Cc:* Bing Xiao (Bing); dev@spark.apache.org; u...@spark.apache.org *Subject:* Re: 答复: Package Release Annoucement: Spark SQL on HBase Astro HBase will not have query engine

Re: [VOTE] Release Apache Spark 1.5.0 (RC1)

2015-08-21 Thread Ted Yu
I pointed hbase-spark module (in HBase project) to 1.5.0-rc1 and was able to build the module (with proper maven repo). FYI On Fri, Aug 21, 2015 at 2:17 PM, mkhaitman mark.khait...@chango.com wrote: Just a heads up that this RC1 release is still appearing as 1.5.0-SNAPSHOT (Not just me

Re: What's the best practice for developing new features for spark ?

2015-08-19 Thread Ted Yu
See this thread: http://search-hadoop.com/m/q3RTtdZv0d1btRHl/Spark+build+modulesubj=Building+Spark+Building+just+one+module+ On Aug 19, 2015, at 1:44 AM, canan chen ccn...@gmail.com wrote: I want to work on one jira, but it is not easy to do unit test, because it involves different

Re: [VOTE] Release Apache Spark 1.4.1

2015-06-29 Thread Ted Yu
The test passes when run alone on my machine as well. Please run test suite. Thanks On Mon, Jun 29, 2015 at 2:01 PM, Tathagata Das tathagata.das1...@gmail.com wrote: @Ted, I ran the following two commands. mvn -Phadoop-2.4 -Dhadoop.version=2.7.0 -Pyarn -Phive -DskipTests clean package mvn

Re: [VOTE] Release Apache Spark 1.4.1

2015-06-29 Thread Ted Yu
that this uncovers a real bug. Even if it does I would not block the release on it because many in the community are waiting for a few important fixes. In general, there will always be outstanding issues in Spark that we cannot address in every release. -Andrew 2015-06-29 14:29 GMT-07:00 Ted Yu yuzhih

Re: add to user list

2015-07-30 Thread Ted Yu
Please take a look at the first section of: https://spark.apache.org/community On Thu, Jul 30, 2015 at 9:23 PM, Sachin Aggarwal different.sac...@gmail.com wrote: -- Thanks Regards Sachin Aggarwal 7760502772

Re: High availability with zookeeper: worker discovery

2015-07-30 Thread Ted Yu
zookeeper is not a direct dependency of Spark. Can you give a bit more detail on how the election / discovery of master works ? Cheers On Thu, Jul 30, 2015 at 7:41 PM, Christophe Schmitz cofcof...@gmail.com wrote: Hi there, I am trying to run a 3 node spark cluster where each nodes contains
