Jorge Pizarro created SPARK-22177:
-------------------------------------

             Summary: Error running ml_ops.sh (SPOT): Can not create a Path from an empty string
                 Key: SPARK-22177
                 URL: https://issues.apache.org/jira/browse/SPARK-22177
             Project: Spark
          Issue Type: Question
          Components: ML, Spark Submit, YARN
    Affects Versions: 2.2.0
         Environment: CentOS 7 1708
Hadoop 2.6.0
Scala 2.11.8
SPOT 1.0
            Reporter: Jorge Pizarro
            Priority: Minor


Running "./ml_ops.sh 20170922 dns 1e-4" fails with "Can not create a Path from an empty string".
Complete output:

[soluser@master spot-ml]$ bash -x ./ml_ops.sh 20170922 dns 1e-4
+ FDATE=20170922
+ DSOURCE=dns
+ YR=2017
+ MH=09
+ DY=22
+ [[ 8 != \8 ]]
+ [[ -z dns ]]
+ source /etc/spot.conf
++ UINODE=master
++ MLNODE=master
++ GWNODE=master
++ DBNAME=spotdb
++ HUSER=/user/soluser
++ NAME_NODE=master
++ WEB_PORT=50070
++ DNS_PATH=/user/soluser/dns/hive/y=2017/m=09/d=22/
++ PROXY_PATH=/user/soluser/dns/hive/y=2017/m=09/d=22/
++ FLOW_PATH=/user/soluser/dns/hive/y=2017/m=09/d=22/
++ HPATH=/user/soluser/dns/scored_results/20170922
++ IMPALA_DEM=master
++ IMPALA_PORT=21050
++ LUSER=/home/soluser
++ LPATH=/home/soluser/ml/dns/20170922
++ RPATH=/home/soluser/ipython/user/20170922
++ LIPATH=/home/soluser/ingest
++ USER_DOMAIN=neosecure
++ SPK_EXEC=1
++ SPK_EXEC_MEM=1g
++ SPK_DRIVER_MEM=1g
++ SPK_DRIVER_MAX_RESULTS=200m
++ SPK_EXEC_CORES=2
++ SPK_DRIVER_MEM_OVERHEAD=100m
++ SPK_EXEC_MEM_OVERHEAD=100m
++ SPK_AUTO_BRDCST_JOIN_THR=10485760
++ LDA_OPTIMIZER=em
++ LDA_ALPHA=1.02
++ LDA_BETA=1.001
++ PRECISION=64
++ TOL=1e-6
++ TOPIC_COUNT=20
++ DUPFACTOR=1000
+ '[' -n 1e-4 ']'
+ TOL=1e-4
+ '[' -n '' ']'
+ MAXRESULTS=-1
+ '[' dns == flow ']'
+ '[' dns == dns ']'
+ RAWDATA_PATH=/user/soluser/dns/hive/y=2017/m=09/d=22/
+ '[' '!' -z neosecure ']'
+ USER_DOMAIN_CMD='--userdomain neosecure'
+ FEEDBACK_PATH=/user/soluser/dns/scored_results/20170922/feedback/ml_feedback.csv
+ HDFS_SCORED_CONNECTS=/user/soluser/dns/scored_results/20170922/scores
+ hdfs dfs -rm -R -f /user/soluser/dns/scored_results/20170922/scores
+ spark-submit --class org.apache.spot.SuspiciousConnects --master yarn --deploy-mode cluster --driver-memory 1g --conf spark.driver.maxResultSize=200m --conf spark.driver.maxPermSize=512m --conf spark.dynamicAllocation.enabled=true --conf spark.dynamicAllocation.maxExecutors=1 --conf spark.executor.cores=2 --conf spark.executor.memory=1g --conf spark.sql.autoBroadcastJoinThreshold=10485760 --conf 'spark.executor.extraJavaOptions=-XX:MaxPermSize=512M -XX:PermSize=512M' --conf spark.kryoserializer.buffer.max=512m --conf spark.yarn.am.waitTime=100s --conf spark.yarn.am.memoryOverhead=100m --conf spark.yarn.executor.memoryOverhead=100m target/scala-2.11/spot-ml-assembly-1.1.jar --analysis dns --input /user/soluser/dns/hive/y=2017/m=09/d=22/ --dupfactor 1000 --feedback /user/soluser/dns/scored_results/20170922/feedback/ml_feedback.csv --ldatopiccount 20 --scored /user/soluser/dns/scored_results/20170922/scores --threshold 1e-4 --maxresults -1 --ldamaxiterations 20 --ldaalpha 1.02 --ldabeta 1.001 --ldaoptimizer em --precision 64 --userdomain neosecure
17/09/29 13:51:56 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
17/09/29 13:51:56 INFO yarn.Client: Requesting a new application from cluster with 0 NodeManagers
17/09/29 13:51:56 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
17/09/29 13:51:56 INFO yarn.Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
17/09/29 13:51:56 INFO yarn.Client: Setting up container launch context for our AM
17/09/29 13:51:56 INFO yarn.Client: Setting up the launch environment for our AM container
17/09/29 13:51:56 INFO yarn.Client: Preparing resources for our AM container
17/09/29 13:51:57 INFO yarn.Client: Deleted staging directory hdfs://master:9000/user/soluser/.sparkStaging/application_1506636890912_0058
Exception in thread "main" java.lang.IllegalArgumentException: Can not create a Path from an empty string
        at org.apache.hadoop.fs.Path.checkPathArg(Path.java:126)
        at org.apache.hadoop.fs.Path.<init>(Path.java:134)
        at org.apache.hadoop.fs.Path.<init>(Path.java:93)
        at org.apache.spark.deploy.yarn.Client.copyFileToRemote(Client.scala:337)
        at org.apache.spark.deploy.yarn.Client.org$apache$spark$deploy$yarn$Client$$distribute$1(Client.scala:458)
        at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:497)
        at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:814)
        at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:169)
        at org.apache.spark.deploy.yarn.Client.run(Client.scala:1091)
        at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1150)
        at org.apache.spark.deploy.yarn.Client.main(Client.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

real    0m3.610s
user    0m5.122s
sys     0m0.369s
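
A note on where this exception tends to come from: the failing frame is yarn.Client.prepareLocalResources, which builds a Hadoop Path for every resource the client distributes (the assembly jar plus anything supplied via --jars/--files or spark.yarn.jars in spark-defaults.conf). An empty entry in one of those comma-separated lists, for example from a doubled or trailing comma, results in new Path(""), which raises exactly "Can not create a Path from an empty string". A minimal sketch of a check (the helper function is my own, not part of Spot or Spark):

```shell
#!/bin/sh
# Flag empty entries in a comma-separated resource list, the kind of value
# given to --jars/--files or spark.yarn.jars. Wrapping the list in commas
# turns any leading, trailing, or doubled comma into a ",," substring.
has_empty_entry() {
  case ",$1," in
    *,,*) return 0 ;;   # empty entry present
    *)    return 1 ;;
  esac
}

has_empty_entry "a.jar,b.jar"  && echo "empty entry" || echo "ok"
has_empty_entry "a.jar,,b.jar" && echo "empty entry" || echo "ok"
has_empty_entry "a.jar,"       && echo "empty entry" || echo "ok"
```

Running it against the lists actually configured (spark-defaults.conf on the submitting host, and any --jars/--files the script adds) would confirm or rule out this cause.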
[soluser@master spot-ml]$ cat /etc/spot.conf 

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at

#    http://www.apache.org/licenses/LICENSE-2.0

# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


#node configuration
UINODE='master'
MLNODE='master'
GWNODE='master'
DBNAME='spotdb'

#hdfs - base user and data source config
HUSER='/user/soluser'
NAME_NODE='master'
WEB_PORT=50070
DNS_PATH=${HUSER}/${DSOURCE}/hive/y=${YR}/m=${MH}/d=${DY}/
PROXY_PATH=${HUSER}/${DSOURCE}/hive/y=${YR}/m=${MH}/d=${DY}/
FLOW_PATH=${HUSER}/${DSOURCE}/hive/y=${YR}/m=${MH}/d=${DY}/
HPATH=${HUSER}/${DSOURCE}/scored_results/${FDATE}

#impala config
IMPALA_DEM=master
IMPALA_PORT=21050

#local fs base user and data source config
LUSER='/home/soluser'
LPATH=${LUSER}/ml/${DSOURCE}/${FDATE}
RPATH=${LUSER}/ipython/user/${FDATE}
LIPATH=${LUSER}/ingest

#dns suspicious connects config
USER_DOMAIN='neosecure'

SPK_EXEC='1'
SPK_EXEC_MEM='1g'
SPK_DRIVER_MEM='1g'
SPK_DRIVER_MAX_RESULTS='200m'
SPK_EXEC_CORES='2'
SPK_DRIVER_MEM_OVERHEAD='100m'
SPK_EXEC_MEM_OVERHEAD='100m'
SPK_AUTO_BRDCST_JOIN_THR='10485760'

LDA_OPTIMIZER='em'
LDA_ALPHA='1.02'
LDA_BETA='1.001'

PRECISION='64'
TOL='1e-6'
TOPIC_COUNT=20
DUPFACTOR=1000

[soluser@master spot-ml]$ spark-shell
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
17/09/29 13:52:57 WARN metastore.ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
Spark context Web UI available at http://192.168.40.158:4040
Spark context available as 'sc' (master = spark://master:7077, app id = app-20170929135251-0000).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.2.0
      /_/
         
Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_144)
Type in expressions to have them evaluated.
Type :help for more information.

scala> :quit

[soluser@master spot-ml]$ java -version
java version "1.8.0_144"
Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)

[soluser@master spot-ml]$ hdfs version
Hadoop 2.6.0-cdh5.12.1
Subversion http://github.com/cloudera/hadoop -r 520d8b072e666e9f21d645ca6a5219fc37535a52
Compiled by jenkins on 2017-08-24T16:34Z
Compiled with protoc 2.5.0
From source with checksum de51bf9693ab9426379a1cd28142cea0
This command was run using /usr/lib/hadoop/hadoop-common-2.6.0-cdh5.12.1.jar
[soluser@master spot-ml]$ 
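
Two details in the session above may be worth checking separately from the Path error. First, the YARN client logs "Requesting a new application from cluster with 0 NodeManagers", so even with the path issue fixed no container could be scheduled. Second, the interactive spark-shell attached to a standalone master (spark://master:7077), not YARN, so it does not exercise the failing code path. YARN health can be checked with the standard `yarn node -list` command; a small helper (my own, purely illustrative) to extract the node count from that command's "Total Nodes:N" header:

```shell
#!/bin/sh
# Read "yarn node -list" output on stdin and print the NodeManager count
# from the "Total Nodes:N" header line, so a script can assert N >= 1
# before submitting.
nodemanager_count() {
  sed -n 's/^Total Nodes:\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# usage on a live cluster (not run here):
#   yarn node -list | nodemanager_count
```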

Thanks in advance
Jorge Pizarro



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
