Alan Conway wrote:
On Thu, 2008-06-26 at 12:12 +0200, Manuel Teira wrote:
Hello.
I did some further investigation and tests related to the change in
r671604, which dropped the lock-file strategy in favour of a flock on the
data dir.

Trying to write similar code using lockf, I hit the issue that
the file must be opened with O_RDWR or O_WRONLY, and that's not allowed
for a directory.
The same happens trying to use a fcntl call.
And unexpectedly, the same for flock. In the solaris manual page:

<snip>
     Read permission is required on a file  to  obtain  a  shared
     lock,   and  write  permission  is  required  to  obtain  an
     exclusive lock.
</snip>

But the linux man page claims:

<snip>
A shared or exclusive lock can be placed on a file regardless of the
mode in which the file was opened.
</snip>

I've searched the web for some BSD system pages, but they don't say
anything about the file mode.


On the other hand, the POSIX fcntl specification says, regarding the
causes of failure:

[EBADF]
    The /fildes/ argument is not a valid open file descriptor, or the
    argument /cmd/ is F_SETLK or F_SETLKW, the type of lock, *l_type*,
    is a shared lock (F_RDLCK), and /fildes/ is not a valid file
    descriptor open for reading, or the type of lock *l_type*, is an
    exclusive lock (F_WRLCK), and /fildes/ is not a valid file
    descriptor open for writing.

The POSIX spec also requires write permission for lockf:
http://www.opengroup.org/onlinepubs/007908799/xsh/lockf.html



This leads to solaris not being able to lock directly on a directory,
I'm afraid. Any idea?
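
The restriction described above can be checked with a few lines. This is only a minimal sketch: openForWriteErrno is a hypothetical helper name, not something from the qpid tree. It relies on POSIX open() failing with EISDIR when a directory is opened for writing, which is why a descriptor suitable for an exclusive lockf or fcntl lock cannot be obtained on a directory.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper (not qpid code): try to open `path` read-write,
// as an exclusive lockf/fcntl lock would require.
// Returns 0 on success, or the errno value set by open() on failure.
int openForWriteErrno(const char* path) {
    int fd = ::open(path, O_RDWR);
    if (fd < 0) return errno;   // EISDIR is expected when path is a directory
    ::close(fd);
    return 0;
}
```

Calling openForWriteErrno(".") should return EISDIR, while the same call on a regular writable file returns 0.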


Yes, we can create (if it doesn't already exist) a lock file in the
directory and then use lockf to lock it. There's already code in
Daemon.cpp that does exactly this for the PID file. The reason I
switched to flock was because crashing or killed brokers were sometimes
leaving the lock file behind them, whereas a flock (or lockf)  lock is
automatically released when the process exits.

We need to
 - create a qpid::sys::LockFile class that can be re-implemented on
different platforms.
 - use the Daemon.cpp code as the posix implementation.
 - Replace the locking code in Daemon.cpp and DataDir.cpp with the
common sys::LockFile.
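
A rough sketch of what the POSIX side of such a class might look like follows. It is only an illustration under assumptions: the class below is hypothetical, is not the code in Daemon.cpp, and its name, constructor signature and error handling are made up for the example. It opens (creating if needed) a regular file inside the directory with O_RDWR and takes a non-blocking lockf lock on it; the lock goes away when the descriptor is closed or the process exits, even if the file itself stays behind.

```cpp
#include <cerrno>
#include <cstring>
#include <fcntl.h>
#include <stdexcept>
#include <string>
#include <unistd.h>

// Hypothetical sketch of a POSIX lock-file wrapper; not the actual qpid class.
class LockFile {
  public:
    explicit LockFile(const std::string& path) : path_(path), fd_(-1) {
        // O_RDWR because lockf needs a descriptor that is open for writing.
        fd_ = ::open(path_.c_str(), O_RDWR | O_CREAT, 0644);
        if (fd_ < 0)
            throw std::runtime_error("Cannot open " + path_ + ": " +
                                     std::strerror(errno));
        // F_TLOCK fails immediately instead of blocking if another
        // process already holds the lock. Length 0 locks to end of file.
        if (::lockf(fd_, F_TLOCK, 0) < 0) {
            int err = errno;
            ::close(fd_);
            throw std::runtime_error("Cannot lock " + path_ + ": " +
                                     std::strerror(err));
        }
    }

    ~LockFile() {
        // Closing the descriptor releases the lock; the file is left in place.
        if (fd_ >= 0) ::close(fd_);
    }

    LockFile(const LockFile&) = delete;
    LockFile& operator=(const LockFile&) = delete;

  private:
    std::string path_;
    int fd_;
};
```

A broker would construct one of these on, say, a file named "lock" inside the data dir (the file name is an assumption) and keep it alive for as long as it runs. A stale file left by a crashed broker is harmless, because a new process can still open it and acquire the lock.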

It's JIRA https://issues.apache.org/jira/browse/QPID-1158
Could you take this on, Manuel? I can do it, but it may take a couple of
days to get to it.
Of course, I will try (I'll try to start on Monday). For the moment I've reverted the changes and keep using the old DataDir.cpp code. I was able to pass most of the tests on Solaris (more changes to remove bashisms are needed, though). I still have to look into some odd messages, but this is a dump of a 'make check' session now:

-bash-3.00$ make check
make libshlibtest.la libdlclose_noop.la unit_test perftest txtest latencytest client_test topic_listener topic_publisher publish consume
`libshlibtest.la' is up to date.
`libdlclose_noop.la' is up to date.
`unit_test' is up to date.
`perftest' is up to date.
`txtest' is up to date.
`latencytest' is up to date.
`client_test' is up to date.
`topic_listener' is up to date.
`topic_publisher' is up to date.
`publish' is up to date.
`consume' is up to date.
make  check-TESTS
Running 154 test cases...
2008-jun-27 17:09:18 error Exception in client dispatch thread: Connection closed by broker

*** No errors detected
PASS: unit_test
PASS: start_broker
PASS: client_test
SubscribeThread exception: Sequence error: expected n==1 but got 0 (perftest.cpp:524)
FAIL: quick_perftest
PASS: quick_topictest
sh: objdump: not found
test_example (tests_0-10.example.ExampleTest) ... ok
test_auto_rollback (tests_0-10.tx.TxTests) ... ok
test_commit (tests_0-10.tx.TxTests) ... ok
test_rollback (tests_0-10.tx.TxTests) ... ok
test_broker_connectivity (tests_0-10.management.ManagementTest) ... ok
test_self_session_id (tests_0-10.management.ManagementTest) ... ok
test_standard_exchanges (tests_0-10.management.ManagementTest) ... ok
test_system_object (tests_0-10.management.ManagementTest) ... ok
test_bad_resume (tests_0-10.dtx.DtxTests) ... ok
test_commit_unknown (tests_0-10.dtx.DtxTests) ... ok
test_end (tests_0-10.dtx.DtxTests) ... ok
test_end_suspend_and_fail (tests_0-10.dtx.DtxTests) ... ok
test_end_unknown_xid (tests_0-10.dtx.DtxTests) ... ok
test_forget_xid_on_completion (tests_0-10.dtx.DtxTests) ... ok
test_get_timeout (tests_0-10.dtx.DtxTests) ... ok
test_get_timeout_unknown (tests_0-10.dtx.DtxTests) ... ok
test_implicit_end (tests_0-10.dtx.DtxTests) ... ok
test_invalid_commit_not_ended (tests_0-10.dtx.DtxTests) ... ok
test_invalid_commit_one_phase_false (tests_0-10.dtx.DtxTests) ... ok
test_invalid_commit_one_phase_true (tests_0-10.dtx.DtxTests) ... ok
test_invalid_prepare_not_ended (tests_0-10.dtx.DtxTests) ... ok
test_invalid_rollback_not_ended (tests_0-10.dtx.DtxTests) ... ok
test_prepare_unknown (tests_0-10.dtx.DtxTests) ... ok
test_recover (tests_0-10.dtx.DtxTests) ... ok
test_rollback_unknown (tests_0-10.dtx.DtxTests) ... ok
test_select_required (tests_0-10.dtx.DtxTests) ... ok
test_set_timeout (tests_0-10.dtx.DtxTests) ... ok
test_simple_commit (tests_0-10.dtx.DtxTests) ... ok
test_simple_prepare_commit (tests_0-10.dtx.DtxTests) ... ok
test_simple_prepare_rollback (tests_0-10.dtx.DtxTests) ... ok
test_simple_rollback (tests_0-10.dtx.DtxTests) ... ok
test_start_already_known (tests_0-10.dtx.DtxTests) ... ok
test_start_join (tests_0-10.dtx.DtxTests) ... ok
test_start_join_and_resume (tests_0-10.dtx.DtxTests) ... ok
test_suspend_resume (tests_0-10.dtx.DtxTests) ... ok
test_suspend_start_end_resume (tests_0-10.dtx.DtxTests) ... ok
test_delete_while_used_by_exchange (tests_0-10.alternate_exchange.AlternateExchangeTests) ... ok
test_delete_while_used_by_queue (tests_0-10.alternate_exchange.AlternateExchangeTests) ... ok
test_queue_delete (tests_0-10.alternate_exchange.AlternateExchangeTests) ... ok
test_unroutable (tests_0-10.alternate_exchange.AlternateExchangeTests) ... ok
test (tests_0-10.exchange.DeclareMethodPassiveFieldNotFoundRuleTests) ... ok
testDefaultExchange (tests_0-10.exchange.DefaultExchangeRuleTests) ... ok
testHeadersBindNoMatchArg (tests_0-10.exchange.ExchangeTests) ... ok
testMatchAll (tests_0-10.exchange.HeadersExchangeTests) ... ok
testMatchAny (tests_0-10.exchange.HeadersExchangeTests) ... ok
testDifferentDeclaredType (tests_0-10.exchange.MiscellaneousErrorsTests) ... ok
testTypeNotKnown (tests_0-10.exchange.MiscellaneousErrorsTests) ... ok
testDirect (tests_0-10.exchange.RecommendedTypesRuleTests) ... ok
testFanout (tests_0-10.exchange.RecommendedTypesRuleTests) ... ok
testHeaders (tests_0-10.exchange.RecommendedTypesRuleTests) ... ok
testTopic (tests_0-10.exchange.RecommendedTypesRuleTests) ... ok
testAmqDirect (tests_0-10.exchange.RequiredInstancesRuleTests) ... ok
testAmqFanOut (tests_0-10.exchange.RequiredInstancesRuleTests) ... ok
testAmqMatch (tests_0-10.exchange.RequiredInstancesRuleTests) ... ok
testAmqTopic (tests_0-10.exchange.RequiredInstancesRuleTests) ... ok
test_ack_and_no_ack (tests_0-10.broker.BrokerTests) ... ok
test_simple_delivery_immediate (tests_0-10.broker.BrokerTests) ... ok
test_simple_delivery_queued (tests_0-10.broker.BrokerTests) ... ok
test_ack (tests_0-10.message.MessageTests) ... ok
test_acquire (tests_0-10.message.MessageTests) ... ok
test_acquire_with_no_accept_and_credit_flow (tests_0-10.message.MessageTests) ... ok
test_cancel (tests_0-10.message.MessageTests) ... ok
test_consume_exclusive (tests_0-10.message.MessageTests) ... ok
test_consume_exclusive2 (tests_0-10.message.MessageTests) ... ok
test_consume_queue_not_found (tests_0-10.message.MessageTests) ... ok
test_consume_queue_not_specified (tests_0-10.message.MessageTests) ... ok
test_consume_unique_consumers (tests_0-10.message.MessageTests) ... ok
test_credit_flow_bytes (tests_0-10.message.MessageTests) ... ok
test_credit_flow_messages (tests_0-10.message.MessageTests) ... ok
test_empty_body (tests_0-10.message.MessageTests) ... ok
test_incoming_start (tests_0-10.message.MessageTests) ... ok
test_no_local (tests_0-10.message.MessageTests) ... ok
test_no_local_awkward (tests_0-10.message.MessageTests) ... ok
test_no_local_exclusive_subscribe (tests_0-10.message.MessageTests) ... ok
test_ranged_ack (tests_0-10.message.MessageTests) ... ok
test_reject (tests_0-10.message.MessageTests) ... ok
test_release (tests_0-10.message.MessageTests) ... ok
test_release_ordering (tests_0-10.message.MessageTests) ... ok
test_release_unacquired (tests_0-10.message.MessageTests) ... ok
test_subscribe_not_acquired (tests_0-10.message.MessageTests) ... ok
test_subscribe_not_acquired_2 (tests_0-10.message.MessageTests) ... ok
test_subscribe_not_acquired_3 (tests_0-10.message.MessageTests) ... ok
test_window_flow_bytes (tests_0-10.message.MessageTests) ... ok
test_window_flow_messages (tests_0-10.message.MessageTests) ... ok
test_ack_message_from_deleted_queue (tests_0-10.persistence.PersistenceTests) ... ok
test_delete_queue_after_publish (tests_0-10.persistence.PersistenceTests) ... ok
test_queue_deletion (tests_0-10.persistence.PersistenceTests) ... ok
test_autodelete_shared (tests_0-10.queue.QueueTests) ... ok
test_bind (tests_0-10.queue.QueueTests) ... ok
test_bind_queue_existence (tests_0-10.queue.QueueTests) ... ok
test_declare_exclusive (tests_0-10.queue.QueueTests) ... ok
test_declare_passive (tests_0-10.queue.QueueTests) ... ok
test_delete_ifempty (tests_0-10.queue.QueueTests) ... ok
test_delete_ifunused (tests_0-10.queue.QueueTests) ... ok
test_delete_queue_exists (tests_0-10.queue.QueueTests) ... ok
test_delete_simple (tests_0-10.queue.QueueTests) ... ok
test_purge (tests_0-10.queue.QueueTests) ... ok
test_purge_empty_name (tests_0-10.queue.QueueTests) ... ok
test_purge_queue_exists (tests_0-10.queue.QueueTests) ... ok
test_unbind_direct (tests_0-10.queue.QueueTests) ... ok
test_unbind_fanout (tests_0-10.queue.QueueTests) ... ok
test_unbind_headers (tests_0-10.queue.QueueTests) ... ok
test_unbind_topic (tests_0-10.queue.QueueTests) ... ok
test_exchange_bound_direct (tests_0-10.query.QueryTests) ... ok
test_exchange_bound_fanout (tests_0-10.query.QueryTests) ... ok
test_exchange_bound_header (tests_0-10.query.QueryTests) ... ok
test_exchange_bound_topic (tests_0-10.query.QueryTests) ... ok
test_exchange_query (tests_0-10.query.QueryTests) ... ok
test_queue_query (tests_0-10.query.QueryTests) ... ok
test_queue_query_unknown (tests_0-10.query.QueryTests) ... ok

----------------------------------------------------------------------
Ran 110 tests in 88.510s

OK
PASS: python_tests
PASS: stop_broker
Running federation tests using brokers on ports 45428 45429
sh: objdump: not found
test_bridge_create_and_close (federation.FederationTests) ... ok
test_pull_from_exchange (federation.FederationTests) ... ok
test_pull_from_queue (federation.FederationTests) ... ok
test_tracing (federation.FederationTests) ... ok

----------------------------------------------------------------------
Ran 4 tests in 48.880s

OK
PASS: run_federation_tests
==============================================
1 of 8 tests failed
Please report to [email protected]
==============================================




Only one test is failing. There's also a weird message during unit_test (Exception in client dispatch thread: Connection closed by broker), plus those "sh: objdump: not found" messages. I'm still not sure where they come from, since at first glance I could not find any objdump invocation. Other than that, it gives me hope of having a working Solaris version soon.

Thanks for all your support!

Best regards and happy weekend.
--
Manuel.



