[ 
https://issues.apache.org/jira/browse/BAHIR-186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16760364#comment-16760364
 ] 

ASF GitHub Bot commented on BAHIR-186:
--------------------------------------

ckadner commented on issue #75: [BAHIR-186] Reduce size of sql-cloudant test 
database
URL: https://github.com/apache/bahir/pull/75#issuecomment-460471642
 
 
   @emlaver -- your changes look good and help speed up the test execution 👍 
   
   I do have an issue that may predate your changes, though. After running
`mvn clean install` and a few subsequent `mvn test -pl sql-cloudant` runs, I
keep running into the following test error in `CloudantChangesDFSuite`:
   
   ```
   - load data and verify deleted doc is not in results *** FAILED ***
     org.apache.bahir.cloudant.common.CloudantException: Error retrieving _changes feed data from database 'n_flight' with response code 401: {"error":"unauthorized","reason":"Name or password is incorrect."}
   ```
   
   When I run `CloudantChangesDFSuite` on its own, I do not get that error.
   ``` 
   $ mvn scalatest:test -pl sql-cloudant -Dsuites='org.apache.bahir.cloudant.CloudantChangesDFSuite'
   
   Run starting. Expected test count is: 9
   CloudantChangesDFSuite:
   
   Sql-cloudant tests that require Cloudant databases have been enabled by
   the environment variables CLOUDANT_USER and CLOUDANT_PASSWORD.
           
   - load and save data from Cloudant database
   - load and count data from Cloudant search index
   - load data and verify deleted doc is not in results
   - load data and count rows in filtered dataframe
   - save filtered dataframe to database
   - save dataframe to database using createDBOnSave=true option
   - load and count data from view
   - load data from view with MapReduce function
   - load data and verify total count of selector, filter, and view option
   Run completed in 2 minutes, 40 seconds.
   Total number of tests run: 9
   Suites: completed 1, aborted 0
   Tests: succeeded 9, failed 0, canceled 0, ignored 0, pending 0
   All tests passed.
   ```
   
   Maven log (console output) for all unit tests in SQL-Cloudant:
   ```
   $ mvn test -pl sql-cloudant
   
   ...
   [INFO] --- scalatest-maven-plugin:1.0:test (test) @ spark-sql-cloudant_2.11 
---
   Discovery starting.
   
   Sql-cloudant tests that require Cloudant databases have been enabled by
   the environment variables CLOUDANT_USER and CLOUDANT_PASSWORD.
    
   ...
   
   CloudantChangesDFSuite:
   - load and save data from Cloudant database
   - load and count data from Cloudant search index
   - load data and verify deleted doc is not in results *** FAILED ***
     org.apache.bahir.cloudant.common.CloudantException: Error retrieving _changes feed data from database 'n_flight' with response code 401: {"error":"unauthorized","reason":"Name or password is incorrect."}
     at org.apache.bahir.cloudant.DefaultSource.create(DefaultSource.scala:162)
     at org.apache.bahir.cloudant.DefaultSource.createRelation(DefaultSource.scala:95)
     at org.apache.bahir.cloudant.DefaultSource.createRelation(DefaultSource.scala:87)
     at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:341)
     at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:239)
     at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227)
     at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:174)
     at org.apache.bahir.cloudant.CloudantChangesDFSuite$$anonfun$8.apply$mcV$sp(CloudantChangesDFSuite.scala:67)
     at org.apache.bahir.cloudant.CloudantChangesDFSuite$$anonfun$8.apply(CloudantChangesDFSuite.scala:62)
     at org.apache.bahir.cloudant.CloudantChangesDFSuite$$anonfun$8.apply(CloudantChangesDFSuite.scala:62)
     ...
   - load data and count rows in filtered dataframe
   - save filtered dataframe to database
   - save dataframe to database using createDBOnSave=true option
   - load and count data from view
   - load data from view with MapReduce function
   - load data and verify total count of selector, filter, and view option
   
   ...
   
   Run completed in 3 minutes, 34 seconds.
   Total number of tests run: 25
   Suites: completed 6, aborted 0
   Tests: succeeded 24, failed 1, canceled 0, ignored 0, pending 0
   *** 1 TEST FAILED ***
   [INFO] 
------------------------------------------------------------------------
   [INFO] BUILD FAILURE
   ```
   
   However, when I look at `sql-cloudant/target/unit-tests.log`, I don't find
that error logged during the execution of that test case:
   ```
   ===== TEST OUTPUT FOR org.apache.bahir.cloudant.CloudantChangesDFSuite: 
'load data and verify deleted doc is not in results' =====
   
   19/02/04 15:07:11.114 ScalaTest-main-running-CloudantChangesDFSuite INFO 
SharedState: loading hive config file: 
jar:file:/Users/ckadner/.m2/repository/org/apache/spark/spark-sql_2.11/2.3.2/spark-sql_2.11-2.3.2-tests.jar!/hive-site.xml
   19/02/04 15:07:11.130 ScalaTest-main-running-CloudantChangesDFSuite INFO 
SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of 
spark.sql.warehouse.dir 
('file:/Users/ckadner/Projects/bahir_apache_merges/sql-cloudant/spark-warehouse/').
   19/02/04 15:07:11.130 ScalaTest-main-running-CloudantChangesDFSuite INFO 
SharedState: Warehouse path is 
'file:/Users/ckadner/Projects/bahir_apache_merges/sql-cloudant/spark-warehouse/'.
   19/02/04 15:07:11.132 ScalaTest-main-running-CloudantChangesDFSuite INFO 
StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
   19/02/04 15:07:11.288 ScalaTest-main-running-CloudantChangesDFSuite INFO 
DefaultSource: Loading data from Cloudant using 
https://testy-spark-connector.cloudant.com/n_flight/_changes?include_docs=true&feed=normal&seq_interval=200&timeout=60000&filter=_selector
   19/02/04 15:07:11.292 streaming-start INFO ReceiverTracker: Starting 1 
receivers
   19/02/04 15:07:11.292 streaming-start INFO ReceiverTracker: ReceiverTracker 
started
   19/02/04 15:07:11.292 streaming-start INFO PluggableInputDStream: Slide time 
= 8000 ms
   19/02/04 15:07:11.293 streaming-start INFO PluggableInputDStream: Storage 
level = Memory Deserialized 1x Replicated
   19/02/04 15:07:11.293 streaming-start INFO PluggableInputDStream: Checkpoint 
interval = null
   19/02/04 15:07:11.293 streaming-start INFO PluggableInputDStream: Remember 
interval = 8000 ms
   19/02/04 15:07:11.293 streaming-start INFO PluggableInputDStream: 
Initialized and validated 
org.apache.spark.streaming.dstream.PluggableInputDStream@3fb41f46
   19/02/04 15:07:11.293 streaming-start INFO ForEachDStream: Slide time = 8000 
ms
   19/02/04 15:07:11.293 streaming-start INFO ForEachDStream: Storage level = 
Serialized 1x Replicated
   19/02/04 15:07:11.293 streaming-start INFO ForEachDStream: Checkpoint 
interval = null
   19/02/04 15:07:11.293 streaming-start INFO ForEachDStream: Remember interval 
= 8000 ms
   19/02/04 15:07:11.293 streaming-start INFO ForEachDStream: Initialized and 
validated org.apache.spark.streaming.dstream.ForEachDStream@14cb05dd
   19/02/04 15:07:11.293 streaming-start INFO RecurringTimer: Started timer for 
JobGenerator at time 1549321632000
   19/02/04 15:07:11.294 streaming-start INFO JobGenerator: Started 
JobGenerator at 1549321632000 ms
   19/02/04 15:07:11.294 streaming-start INFO JobScheduler: Started JobScheduler
   19/02/04 15:07:11.294 ScalaTest-main-running-CloudantChangesDFSuite INFO 
StreamingContext: StreamingContext started
   19/02/04 15:07:11.311 dag-scheduler-event-loop INFO DAGScheduler: Got job 0 
(start at DefaultSource.scala:155) with 1 output partitions
   19/02/04 15:07:11.311 dag-scheduler-event-loop INFO DAGScheduler: Final 
stage: ResultStage 0 (start at DefaultSource.scala:155)
   19/02/04 15:07:11.311 dag-scheduler-event-loop INFO DAGScheduler: Parents of 
final stage: List()
   19/02/04 15:07:11.311 dag-scheduler-event-loop INFO DAGScheduler: Missing 
parents: List()
   19/02/04 15:07:11.311 dispatcher-event-loop-2 INFO ReceiverTracker: Receiver 
0 started
   19/02/04 15:07:11.311 dag-scheduler-event-loop INFO DAGScheduler: Submitting 
ResultStage 0 (Receiver 0 ParallelCollectionRDD[1] at makeRDD at 
ReceiverTracker.scala:613), which has no missing parents
   19/02/04 15:07:11.327 dag-scheduler-event-loop INFO MemoryStore: Block 
broadcast_0 stored as values in memory (estimated size 62.3 KB, free 1638.5 MB)
   19/02/04 15:07:11.329 dag-scheduler-event-loop INFO MemoryStore: Block 
broadcast_0_piece0 stored as bytes in memory (estimated size 21.8 KB, free 
1638.5 MB)
   19/02/04 15:07:11.329 dispatcher-event-loop-3 INFO BlockManagerInfo: Added 
broadcast_0_piece0 in memory on ckadner-mbp.hsd1.ca.comcast.net:62424 (size: 
21.8 KB, free: 1638.6 MB)
   19/02/04 15:07:11.329 dag-scheduler-event-loop INFO SparkContext: Created 
broadcast 0 from broadcast at DAGScheduler.scala:1039
   19/02/04 15:07:11.330 dag-scheduler-event-loop INFO DAGScheduler: Submitting 
1 missing tasks from ResultStage 0 (Receiver 0 ParallelCollectionRDD[1] at 
makeRDD at ReceiverTracker.scala:613) (first 15 tasks are for partitions 
Vector(0))
   19/02/04 15:07:11.330 dag-scheduler-event-loop INFO TaskSchedulerImpl: 
Adding task set 0.0 with 1 tasks
   19/02/04 15:07:11.332 dispatcher-event-loop-0 INFO TaskSetManager: Starting 
task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, 
PROCESS_LOCAL, 9000 bytes)
   19/02/04 15:07:11.332 Executor task launch worker for task 0 INFO Executor: 
Running task 0.0 in stage 0.0 (TID 0)
   19/02/04 15:07:11.341 Executor task launch worker for task 0 INFO 
RecurringTimer: Started timer for BlockGenerator at time 1549321631400
   19/02/04 15:07:11.341 Executor task launch worker for task 0 INFO 
BlockGenerator: Started BlockGenerator
   19/02/04 15:07:11.342 Thread-35 INFO BlockGenerator: Started block pushing 
thread
   19/02/04 15:07:11.342 dispatcher-event-loop-0 INFO ReceiverTracker: 
Registered receiver for stream 0 from ckadner-mbp.hsd1.ca.comcast.net:62423
   19/02/04 15:07:11.342 Executor task launch worker for task 0 INFO 
ReceiverSupervisorImpl: Starting receiver 0
   19/02/04 15:07:11.342 Executor task launch worker for task 0 INFO 
ReceiverSupervisorImpl: Called receiver 0 onStart
   19/02/04 15:07:11.342 Executor task launch worker for task 0 INFO 
ReceiverSupervisorImpl: Waiting for receiver to be stopped
   19/02/04 15:07:12.003 JobGenerator INFO JobScheduler: Added jobs for time 
1549321632000 ms
   19/02/04 15:07:12.003 JobScheduler INFO JobScheduler: Starting job streaming 
job 1549321632000 ms.0 from job set of time 1549321632000 ms
   19/02/04 15:07:12.051 streaming-job-executor-0 INFO SparkContext: Starting 
job: json at DefaultSource.scala:150
   19/02/04 15:07:12.052 streaming-job-executor-0 INFO DAGScheduler: Job 1 
finished: json at DefaultSource.scala:150, took 0.000025 s
   19/02/04 15:07:12.063 dispatcher-event-loop-1 INFO ReceiverTracker: Sent 
stop signal to all 1 receivers
   19/02/04 15:07:12.063 dispatcher-event-loop-2 INFO ReceiverSupervisorImpl: 
Received stop signal
   19/02/04 15:07:12.063 dispatcher-event-loop-2 INFO ReceiverSupervisorImpl: 
Stopping receiver with message: Stopped by driver: 
   19/02/04 15:07:12.063 dispatcher-event-loop-2 INFO ReceiverSupervisorImpl: 
Called receiver onStop
   19/02/04 15:07:12.063 dispatcher-event-loop-2 INFO ReceiverSupervisorImpl: 
Deregistering receiver 0
   19/02/04 15:07:12.064 dispatcher-event-loop-3 ERROR ReceiverTracker: 
Deregistered receiver for stream 0: Stopped by driver
   19/02/04 15:07:12.064 dispatcher-event-loop-2 INFO ReceiverSupervisorImpl: 
Stopped receiver 0
   19/02/04 15:07:12.064 dispatcher-event-loop-2 INFO BlockGenerator: Stopping 
BlockGenerator
   19/02/04 15:07:12.402 dispatcher-event-loop-2 INFO RecurringTimer: Stopped 
timer for BlockGenerator after time 1549321632400
   19/02/04 15:07:12.402 dispatcher-event-loop-2 INFO BlockGenerator: Waiting 
for block pushing thread to terminate
   19/02/04 15:07:12.406 Thread-35 INFO BlockGenerator: Pushing out the last 0 
blocks
   19/02/04 15:07:12.406 Thread-35 INFO BlockGenerator: Stopped block pushing 
thread
   19/02/04 15:07:12.406 dispatcher-event-loop-2 INFO BlockGenerator: Stopped 
BlockGenerator
   19/02/04 15:07:12.406 Executor task launch worker for task 0 INFO 
ReceiverSupervisorImpl: Stopped receiver without error
   19/02/04 15:07:12.407 Executor task launch worker for task 0 INFO Executor: 
Finished task 0.0 in stage 0.0 (TID 0). 751 bytes result sent to driver
   19/02/04 15:07:12.407 task-result-getter-0 INFO TaskSetManager: Finished 
task 0.0 in stage 0.0 (TID 0) in 1076 ms on localhost (executor driver) (1/1)
   19/02/04 15:07:12.407 task-result-getter-0 INFO TaskSchedulerImpl: Removed 
TaskSet 0.0, whose tasks have all completed, from pool 
   19/02/04 15:07:12.408 dag-scheduler-event-loop INFO DAGScheduler: 
ResultStage 0 (start at DefaultSource.scala:155) finished in 1.096 s
   19/02/04 15:07:12.408 streaming-job-executor-0 INFO ReceiverTracker: All of 
the receivers have deregistered successfully
   19/02/04 15:07:12.408 streaming-job-executor-0 INFO ReceiverTracker: 
ReceiverTracker stopped
   19/02/04 15:07:12.408 streaming-job-executor-0 INFO JobGenerator: Stopping 
JobGenerator immediately
   19/02/04 15:07:12.409 streaming-job-executor-0 INFO RecurringTimer: Stopped 
timer for JobGenerator after time 1549321632000
   19/02/04 15:07:12.409 streaming-job-executor-0 INFO JobGenerator: Stopped 
JobGenerator
   19/02/04 15:07:12.687 Spark Context Cleaner INFO ContextCleaner: Cleaned 
accumulator 470
   <<19/02/04 15:07:12. ... Spark Context Cleaner INFO ContextCleaner: Cleaned 
accumulator ...>>
   19/02/04 15:07:12.688 Spark Context Cleaner INFO ContextCleaner: Cleaned 
accumulator 479
   19/02/04 15:07:12.690 dispatcher-event-loop-3 INFO BlockManagerInfo: Removed 
broadcast_0_piece0 on ckadner-mbp.hsd1.ca.comcast.net:62424 in memory (size: 
21.8 KB, free: 1638.6 MB)
   19/02/04 15:07:12.693 Spark Context Cleaner INFO ContextCleaner: Cleaned 
accumulator 467
   <<19/02/04 15:07:12. ... Spark Context Cleaner INFO ContextCleaner: Cleaned 
accumulator ...>>
   19/02/04 15:07:12.693 Spark Context Cleaner INFO ContextCleaner: Cleaned 
accumulator 486
   19/02/04 15:07:14.412 streaming-job-executor-0 INFO JobScheduler: Stopped 
JobScheduler
   19/02/04 15:07:14.413 streaming-job-executor-0 INFO StreamingContext: 
StreamingContext stopped successfully
   19/02/04 15:07:14.415 ScalaTest-main-running-CloudantChangesDFSuite INFO 
CloudantChangesDFSuite: 
   
   ===== FINISHED org.apache.bahir.cloudant.CloudantChangesDFSuite: 'load data 
and verify deleted doc is not in results' =====
   ```
   
   I do, however, find that error message earlier in the log, during the test
case `incorrect password throws an error message for changes receiver`, where
it is expected:
   ```
   ===== TEST OUTPUT FOR org.apache.bahir.cloudant.CloudantOptionSuite: 
'incorrect password throws an error message for changes receiver' =====
   ...
   19/02/04 15:06:32.751 Cloudant Receiver ERROR CookieInterceptor: Credentials 
are incorrect for server 
https://testy-spark-connector.cloudant.com:443/_session, cookie authentication 
will not be attempted again by this interceptor object
   19/02/04 15:06:33.909 Cloudant Receiver WARN ReceiverSupervisorImpl: 
Reported error Error retrieving _changes feed data from database 'n_flight' 
with response code 401: {"error":"unauthorized","reason":"Name or password is 
incorrect."} - org.apache.bahir.cloudant.common.CloudantException: Error 
retrieving _changes feed data from database 'n_flight' with response code 401: 
{"error":"unauthorized","reason":"Name or password is incorrect."}
   19/02/04 15:06:33.912 dispatcher-event-loop-1 WARN ReceiverTracker: Error 
reported by receiver for stream 0: Error retrieving _changes feed data from 
database 'n_flight' with response code 401: 
{"error":"unauthorized","reason":"Name or password is incorrect."} - 
org.apache.bahir.cloudant.common.CloudantException: Error retrieving _changes 
feed data from database 'n_flight' with response code 401: 
{"error":"unauthorized","reason":"Name or password is incorrect."}
        at 
org.apache.bahir.cloudant.internal.ChangesReceiver.org$apache$bahir$cloudant$internal$ChangesReceiver$$receive(ChangesReceiver.scala:79)
        at 
org.apache.bahir.cloudant.internal.ChangesReceiver$$anon$1.run(ChangesReceiver.scala:37)
   
   ...
   19/02/04 15:06:44.445 ScalaTest-main-running-CloudantOptionSuite INFO 
CloudantOptionSuite: 
   
   ===== FINISHED org.apache.bahir.cloudant.CloudantOptionSuite: 'incorrect 
password throws an error message for changes receiver' =====
   ```
   
   
   When I look at the sources here 
https://github.com/apache/bahir/blob/c51853d135ad2d9da67804259f4ed0e29223afb3/sql-cloudant/src/main/scala/org/apache/bahir/cloudant/DefaultSource.scala#L159-L163
   
   ...it appears that this error is a fluke? The test case `load data and
verify deleted doc is not in results` expects the schema to be empty, but
`DefaultSource` treats that as an error and throws the last
`receiverErrorMsg`, which in this case comes from the test case
`incorrect password throws an error message for changes receiver` that ran
earlier.
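   
   To make that suspicion concrete, here is a minimal sketch of how a stale
error message could leak from one test case into another. It is not the actual
Bahir code: `SketchSource`, `SketchChangesConfig`, `SketchException`, and
`StaleErrorDemo` are hypothetical stand-ins, only the name `receiverErrorMsg`
is taken from the linked source, and the idea that it is kept in shared
mutable state is an unverified assumption.
   
   ```scala
   // Hypothetical stand-in for CloudantException; not the real class.
   final class SketchException(msg: String) extends RuntimeException(msg)

   // Assumption: the last receiver error is kept in JVM-wide mutable state,
   // so it survives across test cases and suites run in the same process.
   object SketchChangesConfig {
     @volatile var receiverErrorMsg: String = ""
   }

   object SketchSource {
     // Assumption: an empty schema is treated as a failure, and the error text
     // comes from the shared field rather than from this particular load.
     def load(schemaFields: Seq[String]): Seq[String] =
       if (schemaFields.nonEmpty) schemaFields
       else throw new SketchException(SketchChangesConfig.receiverErrorMsg)
   }

   object StaleErrorDemo {
     def run(): String = {
       // Earlier test (wrong password on purpose): the receiver records a 401.
       SketchChangesConfig.receiverErrorMsg = "response code 401: unauthorized"

       // Later test (valid credentials): nothing fails here, but the load
       // happens to produce an empty schema, so the old message is thrown.
       try {
         SketchSource.load(Seq.empty)
         "no error"
       } catch {
         case e: SketchException => e.getMessage
       }
     }

     def main(args: Array[String]): Unit = println(run())
   }
   ```
   
   If something like this were the cause, resetting the shared message before
each load, or reporting "empty schema" as its own error, would keep the two
test cases from influencing each other.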
   
   At least that is what I thought at first sight, but the test case does not
seem to expect an empty schema:
   
https://github.com/apache/bahir/blob/c51853d135ad2d9da67804259f4ed0e29223afb3/sql-cloudant/src/test/scala/org/apache/bahir/cloudant/CloudantChangesDFSuite.scala#L67-L69
   
   So I am not sure what is happening. But you might? :-)
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Support SSL connection in MQTT SQL Streaming
> --------------------------------------------
>
>                 Key: BAHIR-186
>                 URL: https://issues.apache.org/jira/browse/BAHIR-186
>             Project: Bahir
>          Issue Type: New Feature
>          Components: Spark Streaming Connectors
>            Reporter: Lukasz Antoniak
>            Assignee: Lukasz Antoniak
>            Priority: Major
>             Fix For: Spark-2.4.0
>
>
> Mailing list discussion: 
> https://www.mail-archive.com/user@bahir.apache.org/msg00022.html.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
