Re: Spark SQL -- more than two tables for join
Hi,

The same problem happens when I chain several joins together, for example:

    SELECT * FROM sales INNER JOIN magasin ON sales.STO_KEY = magasin.STO_KEY INNER JOIN eans ON (sales.BARC_KEY = eans.BARC_KEY and magasin.FORM_KEY = eans.FORM_KEY)

The error is as follows:

    py4j.protocol.Py4JJavaError: An error occurred while calling o1229.sql.
    : java.lang.RuntimeException: [1.269] failure: ``UNION'' expected but `INNER' found

    SELECT sales.Date AS Date, sales.ID_FOYER AS ID_FOYER, Sales.STO_KEY AS STO_KEY, sales.Quantite AS Quantite, sales.Prix AS Prix, sales.Total AS Total, magasin.FORM_KEY AS FORM_KEY, eans.UB_KEY AS UB_KEY FROM sales INNER JOIN magasin ON sales.STO_KEY = magasin.STO_KEY INNER JOIN eans ON (sales.BARC_KEY = eans.BARC_KEY and magasin.FORM_KEY = eans.FORM_KEY)
    ^
        at scala.sys.package$.error(package.scala:27)
        at org.apache.spark.sql.catalyst.SqlParser.apply(SqlParser.scala:60)
        at org.apache.spark.sql.SQLContext.parseSql(SQLContext.scala:73)
        at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:260)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
        at py4j.Gateway.invoke(Gateway.java:259)
        at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
        at py4j.commands.CallCommand.execute(CallCommand.java:79)
        at py4j.GatewayConnection.run(GatewayConnection.java:207)
        at java.lang.Thread.run(Thread.java:745)

I have the impression that Spark SQL doesn't support joining more than two tables in a single query.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-SQL-more-than-two-tables-for-join-tp13865p15847.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
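Since the parser stops at the second INNER, one workaround that may be worth trying is to wrap the first join in a subquery so that each SELECT contains only a single JOIN. The sketch below is only an illustration: it uses Python's built-in sqlite3 module (not Spark) with made-up rows to check that the nested form returns the same result as the chained form. The alias `sm`, the reduced column list, and the sample data are assumptions, and whether a given Spark SQL version accepts the nested query still has to be verified.

```python
import sqlite3

# In-memory database with tiny, made-up tables (illustrative only).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE sales (STO_KEY INTEGER, BARC_KEY INTEGER, Quantite INTEGER);
CREATE TABLE magasin (STO_KEY INTEGER, FORM_KEY INTEGER);
CREATE TABLE eans (BARC_KEY INTEGER, FORM_KEY INTEGER, UB_KEY INTEGER);
INSERT INTO sales VALUES (1, 10, 2), (2, 20, 5), (3, 30, 1);
INSERT INTO magasin VALUES (1, 100), (2, 200);
INSERT INTO eans VALUES (10, 100, 7), (20, 999, 8);
""")

# Chained form: two JOINs in one SELECT (the shape that fails to parse above).
chained = """
SELECT sales.STO_KEY, sales.Quantite, magasin.FORM_KEY, eans.UB_KEY
FROM sales
INNER JOIN magasin ON sales.STO_KEY = magasin.STO_KEY
INNER JOIN eans ON (sales.BARC_KEY = eans.BARC_KEY AND magasin.FORM_KEY = eans.FORM_KEY)
"""

# Nested form: the first join becomes a subquery aliased as sm (hypothetical
# alias), so every SELECT contains exactly one JOIN.
nested = """
SELECT sm.STO_KEY, sm.Quantite, sm.FORM_KEY, eans.UB_KEY
FROM (
    SELECT sales.STO_KEY AS STO_KEY, sales.BARC_KEY AS BARC_KEY,
           sales.Quantite AS Quantite, magasin.FORM_KEY AS FORM_KEY
    FROM sales
    INNER JOIN magasin ON sales.STO_KEY = magasin.STO_KEY
) sm
INNER JOIN eans ON (sm.BARC_KEY = eans.BARC_KEY AND sm.FORM_KEY = eans.FORM_KEY)
"""

chained_rows = sorted(cur.execute(chained).fetchall())
nested_rows = sorted(cur.execute(nested).fetchall())
print(chained_rows == nested_rows)
```

If the nested string parses in Spark, it could be passed to the same sql() call that raised the error above.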
The question about mount ephemeral disk in slave-setup.sh
Hi,

I am quite new to Spark, and I have a basic question about mounting ephemeral disks on AWS EC2.

If I understand the spark_ec2.py script correctly, it is spark-ec2/setup-slave.sh that mounts the ephemeral disks (Instance Store Volumes) on AWS EC2. However, in setup-slave.sh, it seems that these disks are only mounted if the instance type begins with r3.

For other instance types, are the ephemeral disks mounted or not? If they are, which script mounts them, or are they mounted automatically by AWS?

Thanks a lot in advance for your help.

Best regards
Gen

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/The-question-about-mount-ephemeral-disk-in-slave-setup-sh-tp15675.html
Re: The question about mount ephemeral disk in slave-setup.sh
I have taken a look at the mesos spark-ec2 code and the AWS documentation, and I think I may have found the answer.

There are two types of AMI in AWS: EBS-backed AMIs and instance-store-backed AMIs. For an EBS-backed AMI, we can add instance store volumes when we create the image (the details can be found at http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/creating-an-ami-ebs.html ). Then, by default, when we launch an instance from this AMI, the default instance store volume is formatted (ext3) and mounted at /media/ephemeral0... etc.

The images provided by mesos spark-ec2 are EBS-backed AMIs, and instance store volumes have already been added to them (I guess). However, the file /etc/fstab has been modified to mount the ephemeral disks at /mnt... etc. (I don't know how they modify /etc/fstab dynamically.)

Finally, as described in setup-slave.sh, ext4 gives the best performance for r3* instances. Hence, the script reformats the ephemeral disks as ext4 and mounts them at /mnt... etc.

Hope this helps someone else.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/The-question-about-mount-ephemeral-disk-in-slave-setup-sh-tp15675p15704.html
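To check on an instance which devices /etc/fstab maps to /mnt, a small script along these lines might help. It is only a sketch: the function name `mounts_under` is hypothetical, and the sample fstab content is invented for illustration rather than copied from a real spark-ec2 instance. It relies only on the standard fstab layout of whitespace-separated fields (device, mount point, filesystem type, ...).

```python
def mounts_under(fstab_text, prefix="/mnt"):
    """Return (device, mount_point, fs_type) for fstab entries whose
    mount point starts with prefix (so /mnt, /mnt2, ... all match)."""
    entries = []
    for line in fstab_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # A usable entry needs at least device, mount point, and fs type.
        if len(fields) < 3:
            continue
        device, mount_point, fs_type = fields[:3]
        if mount_point.startswith(prefix):
            entries.append((device, mount_point, fs_type))
    return entries


# Hypothetical fstab content, for illustration only.
SAMPLE_FSTAB = """\
# device    mount  type  options           dump pass
/dev/xvda1  /      ext4  defaults          1 1
/dev/xvdb   /mnt   ext3  defaults,noatime  0 0
/dev/xvdc   /mnt2  ext3  defaults,noatime  0 0
"""

for device, mount_point, fs_type in mounts_under(SAMPLE_FSTAB):
    print(device, mount_point, fs_type)
```

On a running instance, the same function could be given the contents of the real file, for example mounts_under(open("/etc/fstab").read()).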
Re: Spark Monitoring with Ganglia
Maybe you can follow the instructions at this link: https://github.com/mesos/spark-ec2/tree/v3/ganglia . It works well for me.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Monitoring-with-Ganglia-tp15538p15705.html