Re: Spark SQL -- more than two tables for join

2014-10-07 Thread TANG Gen
Hi, the same problem happens when I try several joins together, such as
'SELECT * FROM sales INNER JOIN magasin ON sales.STO_KEY = magasin.STO_KEY
INNER JOIN eans ON (sales.BARC_KEY = eans.BARC_KEY and magasin.FORM_KEY =
eans.FORM_KEY)'

The error information is as follows:
py4j.protocol.Py4JJavaError: An error occurred while calling o1229.sql.
: java.lang.RuntimeException: [1.269] failure: ``UNION'' expected but `INNER' found

SELECT sales.Date AS Date, sales.ID_FOYER AS ID_FOYER, Sales.STO_KEY AS STO_KEY,
sales.Quantite AS Quantite, sales.Prix AS Prix, sales.Total AS Total,
magasin.FORM_KEY AS FORM_KEY, eans.UB_KEY AS UB_KEY FROM sales INNER JOIN
magasin ON sales.STO_KEY = magasin.STO_KEY INNER JOIN eans ON (sales.BARC_KEY =
eans.BARC_KEY and magasin.FORM_KEY = eans.FORM_KEY)
^
at scala.sys.package$.error(package.scala:27)
at org.apache.spark.sql.catalyst.SqlParser.apply(SqlParser.scala:60)
at org.apache.spark.sql.SQLContext.parseSql(SQLContext.scala:73)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:260)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
at py4j.Gateway.invoke(Gateway.java:259)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:207)
at java.lang.Thread.run(Thread.java:745)


I have the impression that Spark SQL doesn't support joining more than two tables in a single SQL statement.
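One workaround is to perform the joins one at a time and register the intermediate result as a temporary table. Below is a rough, untested sketch in PySpark; it assumes a SQLContext named sqlContext in which sales, magasin and eans are already registered, and that sales carries the BARC_KEY column used in the second join:

    # Step 1: join sales with magasin, keeping BARC_KEY for the next join.
    step1 = sqlContext.sql(
        "SELECT sales.Date AS Date, sales.ID_FOYER AS ID_FOYER, "
        "sales.STO_KEY AS STO_KEY, sales.Quantite AS Quantite, "
        "sales.Prix AS Prix, sales.Total AS Total, "
        "sales.BARC_KEY AS BARC_KEY, magasin.FORM_KEY AS FORM_KEY "
        "FROM sales INNER JOIN magasin ON sales.STO_KEY = magasin.STO_KEY")
    step1.registerTempTable("sales_magasin")

    # Step 2: join the intermediate table with eans.
    result = sqlContext.sql(
        "SELECT sales_magasin.Date AS Date, sales_magasin.ID_FOYER AS ID_FOYER, "
        "sales_magasin.STO_KEY AS STO_KEY, sales_magasin.Quantite AS Quantite, "
        "sales_magasin.Prix AS Prix, sales_magasin.Total AS Total, "
        "sales_magasin.FORM_KEY AS FORM_KEY, eans.UB_KEY AS UB_KEY "
        "FROM sales_magasin INNER JOIN eans "
        "ON (sales_magasin.BARC_KEY = eans.BARC_KEY "
        "AND sales_magasin.FORM_KEY = eans.FORM_KEY)")

Alternatively, a HiveContext, whose parser is more complete than the basic SqlParser, may accept the original multi-join statement as written.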



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-SQL-more-than-two-tables-for-join-tp13865p15847.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



The question about mounting the ephemeral disk in slave-setup.sh

2014-10-03 Thread TANG Gen
Hi,

I am quite a new user of Spark, and I have a stupid question about mounting
the ephemeral disks for AWS EC2.

If I understand the spark_ec2.py script correctly, it is spark-ec2/setup-slave.sh
that mounts the ephemeral disks (instance store volumes) for AWS EC2. However,
in setup-slave.sh, it seems that these disks are only mounted if the instance
type begins with r3.
For other instance types, are their ephemeral disks mounted or not? If so,
which script mounts them, or are they mounted automatically by AWS?

Thanks a lot in advance for your help.

Best regards
Gen




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/The-question-about-mount-ephemeral-disk-in-slave-setup-sh-tp15675.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: The question about mounting the ephemeral disk in slave-setup.sh

2014-10-03 Thread TANG Gen
I have taken a look at the code of mesos/spark-ec2 and the AWS documentation,
and I think I may have found the answer.

In fact, there are two types of AMI in AWS: EBS-backed AMIs and
instance-store-backed AMIs. For an EBS-backed AMI, we can add instance store
volumes when we create the image (the details can be found at
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/creating-an-ami-ebs.html).
Then, by default, when we launch an instance from this AMI, the instance store
volumes will be formatted (ext3) and mounted at /media/ephemeral0, etc.
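For illustration, instance store volumes can also be mapped at launch time,
which is roughly what spark_ec2.py does with boto. A minimal sketch, untested;
the region, AMI id and device name below are placeholders:

    from boto.ec2 import connect_to_region
    from boto.ec2.blockdevicemapping import BlockDeviceMapping, BlockDeviceType

    conn = connect_to_region("us-east-1")

    # Map the first instance store volume to a device name at launch time.
    mapping = BlockDeviceMapping()
    dev = BlockDeviceType()
    dev.ephemeral_name = "ephemeral0"  # first instance store volume
    mapping["/dev/sdb"] = dev          # appears in the OS as /dev/sdb or /dev/xvdb

    conn.run_instances("ami-xxxxxxxx",  # placeholder AMI id
                       instance_type="m3.large",
                       block_device_map=mapping)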

The images provided by mesos/spark-ec2 are EBS-backed AMIs with instance store
volumes already added (I guess). However, the file /etc/fstab is modified so
that the ephemeral disks are mounted at /mnt, etc. (but I don't know how they
modify /etc/fstab dynamically).

Finally, as described in setup-slave.sh, ext4 has the best performance for
r3* instances. Hence, the script reformats the ephemeral disks to ext4 and
mounts them at /mnt, etc.
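For illustration, that reformat-and-mount step could look like the following
Python sketch (the real logic lives in the bash script setup-slave.sh; the
device name /dev/xvdb is an assumption, and this must run as root):

    import os
    import subprocess

    def format_and_mount_ext4(device, mount_point):
        # Reformat the instance store volume as ext4, then mount it.
        subprocess.check_call(["mkfs.ext4", "-q", device])
        if not os.path.isdir(mount_point):
            os.makedirs(mount_point)
        subprocess.check_call(["mount", "-t", "ext4", device, mount_point])

    format_and_mount_ext4("/dev/xvdb", "/mnt")  # device name is an example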

Hope this helps someone else.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/The-question-about-mount-ephemeral-disk-in-slave-setup-sh-tp15675p15704.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Spark Monitoring with Ganglia

2014-10-03 Thread TANG Gen
Maybe you can follow the instructions at this link:
https://github.com/mesos/spark-ec2/tree/v3/ganglia
It works well for me.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Monitoring-with-Ganglia-tp15538p15705.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org