Hi everyone,

Recently I discovered an issue when processing CSV files in Spark, so I decided to fix it following https://issues.apache.org/jira/browse/SPARK-21024 and built a custom distribution for internal use. I built it on my local machine and then uploaded the distribution to the server.
The server's *~/.bashrc* contains:

    # added by Anaconda2 4.3.1 installer
    export PATH="/opt/etl/anaconda/anaconda2/bin:$PATH"
    export SPARK_HOME="/opt/etl/spark-2.1.0-bin-hadoop2.7"
    export PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.10.4-src.zip:$PYTHONPATH

What I did on the server was:

    export SPARK_HOME=/home/etladmin/spark-2.2.1-SNAPSHOT-bin-custom
    $SPARK_HOME/bin/spark-submit --version

It prints out version *2.1.1*, which is *not* the version I built (2.2.1). When I set *SPARK_HOME* on my local machine (macOS) for this distribution, it works well and prints version *2.2.1*.

I need a way to investigate where this unexpected environment value is coming from. Do you have any suggestions?

Thanks in advance.

Regards,
Chanh
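One way to narrow this down is to confirm that the value you exported actually reaches child processes, and to list what is really in the environment. Below is a minimal sketch using DEMO_SPARK_HOME as a stand-in for SPARK_HOME so it runs anywhere; on the real server the same three checks apply verbatim with SPARK_HOME:

```shell
# Sketch: DEMO_SPARK_HOME stands in for SPARK_HOME (path is illustrative).
export DEMO_SPARK_HOME=/home/etladmin/spark-2.2.1-SNAPSHOT-bin-custom

# 1. Confirm the current shell sees the value you exported:
echo "shell sees: $DEMO_SPARK_HOME"

# 2. Confirm a child process inherits it (spark-submit runs as a child process):
sh -c 'echo "child sees: $DEMO_SPARK_HOME"'

# 3. List every matching variable to spot a stale value shadowing yours:
env | grep 'SPARK_HOME'
```

If the real environment still shows the old /opt/etl path, something sourced later (a profile script, a wrapper, a non-interactive shell that re-reads ~/.bashrc) is re-exporting it. It may also be worth checking `conf/spark-env.sh` inside the distribution, since spark-submit sources it on startup.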