I'm using python 3.6 and spark 2.4, I added the correct jars in spark, some
functions are being executed correctly, while others present this error as the
image below. Can anyone help?
ERROR EXECUTE POINTRDD:
When I try to execute this function from PointRDD conditioning these two
parameters
point_rdd = PointRDD(sc, input_location, splitter, carry_other_attributes,
level, s_epsg, t_epsg)
Traceback (most recent call last):
File "", line 1, in
File "/opt/anaconda3/lib/python3.6/site-packages/geospark/utils/meta.py", line
122, in call
return method(args, **kwargs)
File
"/opt/anaconda3/lib/python3.6/site-packages/geospark/core/SpatialRDD/point_rdd.py",
line 381, in init
targetEpsgCode
File "/opt/anaconda3/lib/python3.6/site-packages/py4j/java_gateway.py", line
1569, in call
answer, self._gateway_client, None, self._fqn)
File "/usr/local/spark/python/pyspark/sql/utils.py", line 63, in deco
return f(a, **kw)
File "/opt/anaconda3/lib/python3.6/site-packages/py4j/protocol.py", line 328,
in get_return_value
format(target_id, ".", name), value)
py4j.protocol.Py4JJavaError: An error occurred while calling
None.org.datasyslab.geospark.spatialRDD.PointRDD.
: java.net.ConnectException: Call From hadoopMaster/192.168.0.103 to
hadoopMaster:9000 failed on connection exception: java.net.ConnectException:
Connection refused; For more details see:
http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
at org.apache.hadoop.ipc.Client.call(Client.java:1479)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy26.getFileInfo(Unknown Source)
at
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy27.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2108)
at
org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
at
org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
at
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1317)
at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:57)
at org.apache.hadoop.fs.Globber.glob(Globber.java:252)
at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1676)
at
org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:259)
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:229)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:315)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:204)
at org.apache.spark.rdd.RDD
KaTeX parse error: Can't use function '$' in math mode at position 8:
anonfun$̲partitions$2.ap…: anonfun$partitions$2.apply(RDD.scala:273)
at org.apache.spark.rdd.RDD
anonfun$partitions$2.apply(RDD.scala:269)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:269)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.s
ramonbarrosk @ramonbarrosk fev 11 11:15
These instructions work normally, but this PointRDD has this error
from pyspark import StorageLevel
from geospark.core.SpatialRDD import PointRDD
from geospark.core.enums import FileDataSplitter
input_location = "checkin.csv"
offset = 0 # The point long/lat starts from Column 0
splitter = FileDataSplitter.CSV # FileDataSplitter enumeration
carry_other_attributes = True # Carry Column 2 (hotel, gas, bar...)
level = StorageLevel.MEMORY_ONLY # Storage level from pyspark
s_epsg = "epsg:4326" # Source epsg code
t_epsg = "epsg:5070" # target epsg code
point_rdd = PointRDD(sc, input_location, offset, splitter,
carry_other_attributes)