I am working on the Databricks Reference Applications, porting them to my
company's platform and extending them to emit RDF. I already have them
working with the extension on EC2, and the Log Analyzer application works
on our platform. However, the Twitter Language Classifier application keeps
failing with an HTTP 401 error, with the following message:
15/05/28 16:30:11 ERROR scheduler.ReceiverTracker: Deregistered receiver for
stream 0: Restarting receiver with delay 2000ms: Error receiving tweets -
401:Authentication credentials (https://dev.twitter.com/pages/auth) were
missing or incorrect. Ensure that you have set valid consumer key/secret,
access token/secret, and the system clock is in sync.
<html>\n<head>\n<meta http-equiv="Content-Type" content="text/html;
charset=utf-8"/>\n<title>Error 401 Unauthorized</title>
</head>
<body>
HTTP ERROR: 401

<p>Problem accessing '/1.1/statuses/sample.json?stall_warnings=true'.
Reason:
<pre>    Unauthorized</pre>
</body>
</html>

Relevant discussions can be found on the Internet at:
        http://www.google.co.jp/search?q=d0031b0b or
        http://www.google.co.jp/search?q=1db75513
TwitterException{exceptionCode=[d0031b0b-1db75513], statusCode=401,
message=null, code=-1, retryAfter=-1, rateLimitStatus=null, version=3.0.3}
        at twitter4j.internal.http.HttpClientImpl.request(HttpClientImpl.java:177)
        at twitter4j.internal.http.HttpClientWrapper.request(HttpClientWrapper.java:61)
        at twitter4j.internal.http.HttpClientWrapper.get(HttpClientWrapper.java:89)
        at twitter4j.TwitterStreamImpl.getSampleStream(TwitterStreamImpl.java:176)
        at twitter4j.TwitterStreamImpl$4.getStream(TwitterStreamImpl.java:164)
        at twitter4j.TwitterStreamImpl$TwitterStreamConsumer.run(TwitterStreamImpl.java:462)

15/05/28 16:30:13 INFO scheduler.ReceiverTracker: Registered receiver for
stream 0 from akka.tcp://sparkExecutor@mesos-0020-n3:56412

As the message says, a 401 can be caused by either bad credentials or
unsynchronized clocks. However, I know my credentials work, as I have
tested them both on EC2 and directly in the OAuth Testing Tool in the
Twitter Application Manager. Likewise, all the nodes in our cluster are
running ntpd, so that should not be the problem either. I looked through
the API documentation for the error codes and saw that a 401 can also be
returned for calls to the now-deprecated v1 API endpoints, but the failing
request above is to /1.1/statuses/sample.json, and the Databricks
applications run correctly in other contexts, such as EC2, so I do not
think that is the problem either. I have run out of ideas and could use
some help.
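
Because OAuth 1.0a requests carry a timestamp, one way to confirm the clock
on a given node (rather than infer it from ntpd being up) is to compare the
local time with the Date header of any HTTP response from the API host. The
following is only a minimal sketch of that comparison; the function name
clock_skew_seconds and its parameters are hypothetical, and it assumes the
Date header has been obtained separately (for example with curl -sI) on the
node that runs the receiver.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def clock_skew_seconds(date_header, local_now=None):
    """Return local time minus server time, in seconds.

    date_header is the value of an HTTP Date response header, e.g.
    "Thu, 28 May 2015 16:30:11 GMT". A positive result means the local
    clock is ahead of the server; a negative one means it is behind.
    """
    server_time = parsedate_to_datetime(date_header)
    # Treat a header without zone information as UTC (assumption).
    if server_time.tzinfo is None:
        server_time = server_time.replace(tzinfo=timezone.utc)
    if local_now is None:
        local_now = datetime.now(timezone.utc)
    return (local_now - server_time).total_seconds()


if __name__ == "__main__":
    # Example with a fixed local time so the result is reproducible.
    header = "Thu, 28 May 2015 16:30:11 GMT"
    now = datetime(2015, 5, 28, 16, 30, 41, tzinfo=timezone.utc)
    print(clock_skew_seconds(header, now))
```

Run on each node that can host the receiver, a result within a few seconds
of zone would suggest the clock is not the cause; the exact skew the API
tolerates is not something this sketch assumes.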



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Twitter-Streaming-HTTP-401-Error-tp23080.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
