[ https://issues.apache.org/jira/browse/HDFS-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stephen Bovy updated HDFS-5541:
-------------------------------
    Attachment: pdclibhdfs.zip

Windows Porting Project (and other *nix compatibility)

Tested with the Hortonworks Windows distribution, based on Hadoop 1.1.3, with JDK "1.6.0_31".

These changes are based on the latest GA 2.0.xx release.

Unix/Windows Compatibility Changes and Some Performance Enhancements

Added "uthash" for the Windows hash table:

    #ifdef WIN32
    #include "uthash.h"
    #endif

Added many #defines for Windows vs. Unix.

Added JVM mutex macros:

    #ifdef WIN32
    #define LOCK_JVM_MUTEX() \
        dwWaitResult = WaitForSingleObject(hdfs_JvmMutex,INFINITE)
    #else
    #define LOCK_JVM_MUTEX() \
        pthread_mutex_lock(&hdfs_JvmMutex)
    #endif

    #ifdef WIN32
    #define UNLOCK_JVM_MUTEX() \
        ReleaseMutex(hdfs_JvmMutex)
    #else
    #define UNLOCK_JVM_MUTEX() \
        pthread_mutex_unlock(&hdfs_JvmMutex)
    #endif

>> Dynamically load the JVM << (more flexible, and easier to build)

Added a simplistic starting point for a library init function. When this function is used, locking in getJNIEnv can be avoided:

    int hdfsLibInit ( void * parms )
    {
        JNIEnv* env = getJNIEnv();
        if (!env) return 1;
        hdfs_InitLib = 1;
        return 0;
    }

Converted the thread-local storage init to use pthread_once to eliminate some locking issues (see below):

    JNIEnv* getJNIEnv(void)
    {
        JNIEnv *env = NULL;
        HDFSTLS *tls = NULL;
        int ret = 0;
        jint rv = 0;

    #ifdef WIN32
        DWORD dwWaitResult;
        tls = TlsGetValue(hdfs_dwTlsIndex1);
        if (tls) return tls->env;
    #endif

    #ifdef HAVE_BETTER_TLS
        static __thread HDFSTLS *quickTls = NULL;
        if (quickTls) return quickTls->env;
    #endif

    #ifndef WIN32
        pthread_once(&hdfs_threadInit_Once, Make_Thread_Key);
        if (!hdfs_gTlsKeyInitialized)
            return NULL;
        tls = pthread_getspecific(hdfs_gTlsKey);
        if (tls) {
            return tls->env;
        }
    #endif

        if (!hdfs_InitLib) {
            LOCK_JVM_MUTEX();
            env = getGlobalJNIEnv();
            UNLOCK_JVM_MUTEX();
        } else {
            rv = (*hdfs_JVM)->AttachCurrentThread(hdfs_JVM, (void**) &env, 0);
            if (rv != 0) {
                fprintf(stderr, "Call to AttachCurrentThread "
                        "failed with error: %d\n", rv);
                return NULL;
            }
        }

        if (!env) {
            fprintf(stderr, "getJNIEnv: getGlobalJNIEnv failed\n");
            return NULL;
        }

        tls = calloc ( 1, sizeof(HDFSTLS) );
        if (!tls) {
            fprintf(stderr, "getJNIEnv: OOM allocating %zd bytes\n",
                    sizeof(HDFSTLS) );
            return NULL;
        }
        tls->env = env;

    #ifdef WIN32
        printf ( "dll: save environment\n" );
        if (!TlsSetValue(hdfs_dwTlsIndex1, tls))
            return NULL;
        return env;
    #endif

    #ifdef HAVE_BETTER_TLS
        quickTls = tls;
        return env;
    #endif

    #ifndef WIN32
        ret = pthread_setspecific(hdfs_gTlsKey, tls);
        if (ret) {
            fprintf(stderr, "getJNIEnv: pthread_setspecific failed with "
                    "error code %d\n", ret);
            hdfsThreadDestructor(tls);
            return NULL;
        }
    #endif

        return env;
    }

Also used pthread_once to init the hash table and simplify hash table locking:

    static int insertEntryIntoTable ( const char *key, void *data )
    {
        ENTRY e, *ep = NULL;
        if (key == NULL || data == NULL) {
            return 0;
        }
        pthread_once ( &hdfs_hashTable_Once, hashTableInit );
        if ( !hdfs_hashTableInited ) {
            return -1;
        }

>>>>>>>>>>>>>>>>>>

Note: Some recent enhancements are not backwards compatible.

    /* This is not backwards compatible */
    /*
    jthr = invokeMethod ( env, NULL, STATIC, NULL,
                          "org/apache/hadoop/fs/FileSystem",
                          "loadFileSystems", "()V" );
    if (jthr) {
        printExceptionAndFree ( env, jthr, PRINT_EXC_ALL, "loadFileSystems" );
        return NULL;
    }
    */

>>>>>>>>>>>>>>>>>>>>>>

The "newInstance" functions are not backwards compatible and therefore must be avoided.

The new readDirect function produces a method error on Windows with 64-bit JDK 1.6.0_31:

    java version "1.6.0_31"
    Java(TM) SE Runtime Environment (build 1.6.0_31-b05)
    Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode)

    could not find method read from class org/apache/hadoop/fs/FSDataInputStream with signature (Ljava/nio/ByteBuffer;)I
    readDirect: FSDataInputStream#read error:
    Begin Method Invokation:org/apache/commons/lang/exception/ExceptionUtils ## getStackTrace
    End Method Invokation
    Method success
    java.lang.NoSuchMethodError: read
    hdfsOpenFile(/tmp/testfile.txt): WARN: Unexpected error 255 when testing for direct read compatibility

>>> And finally >> Dag nab it >> I cannot figure this one out >> the append does not work:

    Begin Method Invokation:org/apache/hadoop/fs/FileSystem ## append
    org.apache.hadoop.ipc.RemoteException: java.io.IOException: Append is not supported. Please see the dfs.support.append configuration parameter
            at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:1840)
            at org.apache.hadoop.hdfs.server.namenode.NameNode.append(NameNode.java:768)


> LIBHDFS questions and performance suggestions
> ---------------------------------------------
>
>                 Key: HDFS-5541
>                 URL: https://issues.apache.org/jira/browse/HDFS-5541
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>            Reporter: Stephen Bovy
>            Priority: Minor
>         Attachments: pdclibhdfs.zip
>
>
> Since libhdfs is a "client" interface, and especially because it is a "C"
> interface, it should be assumed that the code will be used across many
> different platforms, and many different compilers.
> 1) The code should be cross platform ( no Linux extras )
> 2) The code should compile on standard c89 compilers, the
> >>> {least common denominator rule applies here} !! <<
> C code with "c" extension should follow the rules of the c standard
> All variables must be declared at the beginning of scope, and no (//)
> comments allowed
> >> I just spent a week white-washing the code back to normal C standards so
> >> that it could compile and build across a wide range of platforms <<
> Now on to performance questions
> 1) If threads are not used why do a thread attach ( when threads are not used
> all the thread attach nonsense is a waste of time and a performance killer )
> 2) The JVM init code should not be embedded within the context of every
> function call. The JVM init code should be in a stand-alone LIBINIT
> function that is only invoked once. The JVM * and the JNI * should be
> global variables for use when no threads are utilized.
> 3) When threads are utilized the attach function can use the GLOBAL jvm *
> created by the LIBINIT { WHICH IS INVOKED ONLY ONCE } and thus safely
> outside the scope of any LOOP that is using the functions
> 4) Hash Table and Locking  Why ?????
> When threads are used the hash table locking is going to hurt performance.
> Why not use thread local storage for the hash table, that way no locking is
> required either with or without threads.
>
> 5) FINALLY Windows Compatibility
> Do not use posix features if they cannot easily be replaced on other
> platforms !!



--
This message was sent by Atlassian JIRA
(v6.1#6144)
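
The one-time initialization idea discussed above (a LIBINIT that runs once, with per-thread state cached in thread-local storage so the fast path takes no lock) could be sketched roughly as follows. This is a minimal POSIX-only sketch, not the code in pdclibhdfs.zip: the names lib_state, lib_init, lib_get_thread_handle and lib_init_count are hypothetical, and a plain struct stands in for the JVM so the example builds without JNI. It is written in C89 style (declarations at the start of scope, no // comments).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an expensive per-process resource (e.g. a JVM). */
typedef struct {
    int init_count;            /* how many times the expensive init ran */
} lib_state;

static lib_state      g_state;                  /* hypothetical global state */
static pthread_once_t g_once = PTHREAD_ONCE_INIT;
static pthread_key_t  g_tls_key;
static int            g_tls_key_ok = 0;

/* Runs exactly once per process, however many threads call lib_init(). */
static void lib_init_once(void)
{
    g_state.init_count++;
    g_tls_key_ok = (pthread_key_create(&g_tls_key, NULL) == 0);
}

/* Returns 0 on success, 1 on failure. Safe to call repeatedly. */
int lib_init(void)
{
    pthread_once(&g_once, lib_init_once);
    return g_tls_key_ok ? 0 : 1;
}

/* Per-thread lookup: after the first call on a thread, this is only a
 * pthread_getspecific() read and takes no mutex. */
void *lib_get_thread_handle(void)
{
    void *h;

    if (lib_init() != 0)
        return NULL;
    assert(g_tls_key_ok);

    h = pthread_getspecific(g_tls_key);
    if (h == NULL) {
        /* Stand-in for the per-thread attach step. */
        h = &g_state;
        if (pthread_setspecific(g_tls_key, h) != 0)
            return NULL;
    }
    return h;
}

/* Exposed only so a caller can check that init ran a single time. */
int lib_init_count(void)
{
    return g_state.init_count;
}
```

A caller would invoke lib_init() once before entering any loop and then call lib_get_thread_handle() inside it; repeated calls reuse the cached per-thread value. A Windows build would need equivalents for pthread_once and the TLS key, which this sketch does not attempt.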