[
https://issues.apache.org/jira/browse/HDFS-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Stephen Bovy updated HDFS-5541:
-------------------------------
Attachment: pdclibhdfs.zip
Windows Porting Project (and other *nix compatibility)
Testing with the Hortonworks Windows distribution based on Hadoop 1.1.3 and with JDK
"1.6.0_31"
These changes are based on the latest GA 2.0.xx release
Unix/Windows Compatibility Changes
And Some Performance Enhancements
Added "uthash" for the Windows hash table
#ifdef WIN32
#include "uthash.h"
#endif
Added many #defines for Windows vs. Unix
Added JVM mutex macros
#ifdef WIN32
#define LOCK_JVM_MUTEX() \
dwWaitResult = WaitForSingleObject(hdfs_JvmMutex,INFINITE)
#else
#define LOCK_JVM_MUTEX() \
pthread_mutex_lock(&hdfs_JvmMutex)
#endif
#ifdef WIN32
#define UNLOCK_JVM_MUTEX() \
ReleaseMutex(hdfs_JvmMutex)
#else
#define UNLOCK_JVM_MUTEX() \
pthread_mutex_unlock(&hdfs_JvmMutex)
#endif
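A minimal sketch of how lock/unlock macros of this kind might be used around a shared counter. The mutex definitions, INIT_JVM_MUTEX(), jvm_users and register_jvm_user() are assumptions made for illustration and are not taken from the attachment; only the lock and unlock calls mirror the macros above.

```c
#include <stddef.h>

#ifdef WIN32
#include <windows.h>
/* Assumption: the mutex is an unnamed Win32 mutex created once at startup. */
static HANDLE hdfs_JvmMutex = NULL;
#define INIT_JVM_MUTEX()   (hdfs_JvmMutex = CreateMutex(NULL, FALSE, NULL))
#define LOCK_JVM_MUTEX()   WaitForSingleObject(hdfs_JvmMutex, INFINITE)
#define UNLOCK_JVM_MUTEX() ReleaseMutex(hdfs_JvmMutex)
#else
#include <pthread.h>
/* Assumption: a statically initialized pthread mutex needs no init call. */
static pthread_mutex_t hdfs_JvmMutex = PTHREAD_MUTEX_INITIALIZER;
#define INIT_JVM_MUTEX()   ((void) 0)
#define LOCK_JVM_MUTEX()   pthread_mutex_lock(&hdfs_JvmMutex)
#define UNLOCK_JVM_MUTEX() pthread_mutex_unlock(&hdfs_JvmMutex)
#endif

/* Hypothetical shared state guarded by the mutex. */
static int jvm_users = 0;

/* Increment the shared counter under the lock and return the new value. */
int register_jvm_user(void)
{
    int count;

    LOCK_JVM_MUTEX();
    count = ++jvm_users;
    UNLOCK_JVM_MUTEX();
    return count;
}
```

Callers would run INIT_JVM_MUTEX() once before the first lock; on the pthread side it expands to a no-op.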
>> Dynamically load the JVM << (more flexible) (and easier to build)
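One way dynamic loading could look, as a rough sketch only: resolve a symbol from a shared library at run time instead of linking against it. The helper load_jvm_symbol() and the generic_fn typedef are hypothetical names, and the library path is whatever the caller supplies. JNI_CreateJavaVM is the standard JNI invocation entry point that a caller would typically look up this way. On older glibc versions the POSIX branch may need -ldl at link time.

```c
#include <stddef.h>

#ifdef WIN32
#include <windows.h>
#else
#include <dlfcn.h>
#endif

/* Generic function pointer; the caller casts it to the real signature. */
typedef void (*generic_fn)(void);

/*
 * Load the library at libPath and look up the named symbol.
 * Returns NULL if the library or the symbol cannot be found.
 */
generic_fn load_jvm_symbol(const char *libPath, const char *symbol)
{
    generic_fn fn = NULL;
#ifdef WIN32
    HMODULE lib = LoadLibraryA(libPath);
    if (lib == NULL)
        return NULL;
    fn = (generic_fn) GetProcAddress(lib, symbol);
    if (fn == NULL)
        FreeLibrary(lib);
#else
    void *lib = dlopen(libPath, RTLD_NOW | RTLD_GLOBAL);
    if (lib == NULL)
        return NULL;
    /* POSIX idiom for storing a dlsym() result in a function pointer. */
    *(void **) (&fn) = dlsym(lib, symbol);
    if (fn == NULL)
        dlclose(lib);
#endif
    return fn;
}
```

A caller would pass the path to its own jvm.dll or libjvm.so together with "JNI_CreateJavaVM" and cast the result to the JNI_CreateJavaVM signature from jni.h before calling it.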
Added a simplistic starting point for a lib init function.
When this function is used, locking in getJNIEnv can be avoided.
int hdfsLibInit ( void * parms )
{
    JNIEnv* env = getJNIEnv();
    if (!env) return 1;
    hdfs_InitLib = 1;
    return 0;
}
Converted thread-local storage init to use pthread_once
to eliminate some locking issues
(see below):
JNIEnv* getJNIEnv(void)
{
    JNIEnv *env = NULL;
    HDFSTLS *tls = NULL;
    int ret = 0;
    jint rv = 0;
#ifdef WIN32
    DWORD dwWaitResult;    /* assigned by LOCK_JVM_MUTEX() */
#endif
#ifdef HAVE_BETTER_TLS
    static __thread HDFSTLS *quickTls = NULL;
#endif

    /* Fast path: this thread already has a cached JNIEnv. */
#ifdef WIN32
    tls = TlsGetValue(hdfs_dwTlsIndex1);
    if (tls) return tls->env;
#endif
#ifdef HAVE_BETTER_TLS
    if (quickTls) return quickTls->env;
#endif
#ifndef WIN32
    pthread_once(&hdfs_threadInit_Once, Make_Thread_Key);
    if (!hdfs_gTlsKeyInitialized)
        return NULL;
    tls = pthread_getspecific(hdfs_gTlsKey);
    if (tls) {
        return tls->env;
    }
#endif

    /* Slow path: create the JVM (locked) or attach this thread to it. */
    if (!hdfs_InitLib) {
        LOCK_JVM_MUTEX();
        env = getGlobalJNIEnv();
        UNLOCK_JVM_MUTEX();
    } else {
        rv = (*hdfs_JVM)->AttachCurrentThread(hdfs_JVM, (void**) &env, 0);
        if (rv != 0) {
            fprintf(stderr, "Call to AttachCurrentThread "
                    "failed with error: %d\n", (int) rv);
            return NULL;
        }
    }
    if (!env) {
        fprintf(stderr, "getJNIEnv: getGlobalJNIEnv failed\n");
        return NULL;
    }
    tls = calloc ( 1, sizeof(HDFSTLS) );
    if (!tls) {
        /* sizeof yields size_t; cast for a C89-friendly format */
        fprintf(stderr, "getJNIEnv: OOM allocating %lu bytes\n",
                (unsigned long) sizeof(HDFSTLS) );
        return NULL;
    }
    tls->env = env;
#ifdef WIN32
    printf ( "dll: save environment\n" );
    if (!TlsSetValue(hdfs_dwTlsIndex1, tls)) {
        free(tls);    /* do not leak the block on failure */
        return NULL;
    }
#endif
#ifndef WIN32
    /* Always register with the key so the thread destructor runs. */
    ret = pthread_setspecific(hdfs_gTlsKey, tls);
    if (ret) {
        fprintf(stderr, "getJNIEnv: pthread_setspecific failed with "
                "error code %d\n", ret);
        hdfsThreadDestructor(tls);
        return NULL;
    }
#endif
#ifdef HAVE_BETTER_TLS
    quickTls = tls;    /* set only after the key was stored */
#endif
    return env;
}
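The per-thread caching pattern used by getJNIEnv can be sketched on its own, without JNI, roughly as follows. This is an illustration for POSIX threads only; THREAD_STATE, get_thread_state(), make_key() and state_destructor() are hypothetical stand-ins for HDFSTLS, getJNIEnv, Make_Thread_Key and hdfsThreadDestructor, whose real bodies are in the attachment and not shown here.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for HDFSTLS: one cached pointer per thread. */
typedef struct {
    void *env;
} THREAD_STATE;

static pthread_once_t key_once = PTHREAD_ONCE_INIT;
static pthread_key_t  state_key;
static int            key_ok = 0;

/* Runs at thread exit for every thread that stored a value in the key. */
static void state_destructor(void *p)
{
    free(p);
}

/* Runs exactly once, no matter how many threads race to call it. */
static void make_key(void)
{
    if (pthread_key_create(&state_key, state_destructor) == 0)
        key_ok = 1;
}

/* Return this thread's state, allocating it on the first call. */
THREAD_STATE *get_thread_state(void)
{
    THREAD_STATE *st;

    pthread_once(&key_once, make_key);
    if (!key_ok)
        return NULL;
    st = pthread_getspecific(state_key);
    if (st != NULL)
        return st;                 /* fast path: already cached */
    st = calloc(1, sizeof(*st));
    if (st == NULL)
        return NULL;
    if (pthread_setspecific(state_key, st) != 0) {
        free(st);
        return NULL;
    }
    return st;
}
```

Because pthread_once guards the key creation, no mutex is needed on either the first call or the later fast-path calls.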
Also used pthread_once to initialize the hash table and simplify hash table locking
static int insertEntryIntoTable ( const char *key, void *data )
{
    ENTRY e, *ep = NULL;
    if (key == NULL || data == NULL) {
        return 0;
    }
    pthread_once ( &hdfs_hashTable_Once, hashTableInit );
    if ( !hdfs_hashTableInited ) {
        return -1;
    }
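As a self-contained illustration of the one-time initialization used here, the following sketch initializes a small table exactly once with pthread_once. The table layout, TABLE_SLOTS, table_init(), ensure_table() and init_calls are assumptions for the example and do not reflect the hash table in the attachment.

```c
#include <pthread.h>
#include <stdlib.h>

#define TABLE_SLOTS 64    /* hypothetical table size */

static pthread_once_t table_once   = PTHREAD_ONCE_INIT;
static void         **table        = NULL;
static int            table_inited = 0;
static int            init_calls   = 0;   /* counts how often init ran */

/* One-time initializer: allocate the table and record success. */
static void table_init(void)
{
    init_calls++;
    table = calloc(TABLE_SLOTS, sizeof(void *));
    if (table != NULL)
        table_inited = 1;
}

/* Returns 0 when the table is ready, -1 if initialization failed. */
int ensure_table(void)
{
    pthread_once(&table_once, table_init);
    return table_inited ? 0 : -1;
}
```

Every caller can invoke ensure_table() without holding a lock; the initializer still runs only on the first call.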
>>>>>>>>>>>>>>>>>>
Note: Some recent enhancements are not backwards compatible
/* This is not backwards compatible */
/*
jthr = invokeMethod ( env, NULL, STATIC, NULL,
                      "org/apache/hadoop/fs/FileSystem",
                      "loadFileSystems", "()V" );
if (jthr) {
    printExceptionAndFree ( env, jthr, PRINT_EXC_ALL,
                            "loadFileSystems" );
    return NULL;
} */
>>>>>>>>>>>>>>>>>>>>>>
The "newInstance" functions are not backwards compatible
and therefore must be avoided.
The new readDirect function produces a method error on the Windows 64-bit
JDK 1.6.0_31:
java version "1.6.0_31"
Java(TM) SE Runtime Environment (build 1.6.0_31-b05)
Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode)
could not find method read from class org/apache/hadoop/fs/FSDataInputStream with signature (Ljava/nio/ByteBuffer;)I
readDirect: FSDataInputStream#read error:
Begin Method Invokation:org/apache/commons/lang/exception/ExceptionUtils ## getStackTrace
End Method Invokation
Method success
java.lang.NoSuchMethodError: read
hdfsOpenFile(/tmp/testfile.txt): WARN: Unexpected error 255 when testing for direct read compatibility
>>>
And finally >>
Dag nab it >> I cannot figure this one out >> the append does not work
Begin Method Invokation:org/apache/hadoop/fs/FileSystem ## append
org.apache.hadoop.ipc.RemoteException: java.io.IOException: Append is not supported. Please see the dfs.support.append configuration parameter
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:1840)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.append(NameNode.java:768)
> LIBHDFS questions and performance suggestions
> ---------------------------------------------
>
> Key: HDFS-5541
> URL: https://issues.apache.org/jira/browse/HDFS-5541
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs-client
> Reporter: Stephen Bovy
> Priority: Minor
> Attachments: pdclibhdfs.zip
>
>
> Since libhdfs is a "client" interface, and especially because it is a "C"
> interface, it should be assumed that the code will be used across many
> different platforms and many different compilers.
> 1) The code should be cross-platform (no Linux extras)
> 2) The code should compile on standard C89 compilers; the
> >>> {least common denominator rule applies here} !! <<
> C code with a "c" extension should follow the rules of the C standard:
> all variables must be declared at the beginning of a scope, and no (//)
> comments are allowed
> >> I just spent a week white-washing the code back to normal C standards so
> >> that it could compile and build across a wide range of platforms <<
> Now on to performance questions
> 1) If threads are not used, why do a thread attach? (When threads are not used,
> all the thread attach nonsense is a waste of time and a performance killer.)
> 2) The JVM init code should not be embedded within the context of every
> function call. The JVM init code should be in a stand-alone LIBINIT
> function that is only invoked once. The JVM * and the JNI * should be
> global variables for use when no threads are utilized.
> 3) When threads are utilized, the attach function can use the GLOBAL jvm *
> created by the LIBINIT { WHICH IS INVOKED ONLY ONCE } and thus safely
> outside the scope of any LOOP that is using the functions
> 4) Hash Table and Locking: Why ?????
> When threads are used, the hash table locking is going to hurt performance.
> Why not use thread-local storage for the hash table? That way no locking is
> required either with or without threads.
>
> 5) FINALLY Windows Compatibility
> Do not use POSIX features if they cannot easily be replaced on other
> platforms !!