Tuesday, September 14, 2010

Java Crashing in libc.so.6

I have to share this horrible experience, although it is not directly related to programming. I hope this article will be helpful to somebody.

I use Java in my browser for practical things like digitally signing requests when communicating with offices and banks. I use it very often and without any serious problems. Of course I had to configure my truststore and keystore, and I also keep two versions of the extension directory (jre/lib/ext), because different offices and banks use different versions of the IAIK library. But that is another story.

But one fine morning, when I tried to connect to a page containing an applet, my Java console window (which I have configured to always open) flashed and the entire browser crashed. I was using JDK 1.6.0_04 (32-bit, on a 64-bit Linux machine with kernel 2.6.23).

I tried again and again, but with the same result - a crash. After a while I found that files named "hs_err_pidXXXX.log" were being generated, where XXXX was the PID of the process that had crashed. I looked into one of them - it began with these lines:


#
# An unexpected error has been detected by Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x0094a8ac, pid=5587, tid=4108254096
#
# Java VM: Java HotSpot(TM) Client VM (10.0-b19 mixed mode, sharing linux-x86)
# Problematic frame:
# C [libc.so.6+0x718ac] memcpy+0x1c
#
# If you would like to submit a bug report, please visit:
# http://java.sun.com/webapps/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#


And then came a lot of stack traces, thread states and so on, which I was not interested in.

So it looked like a problem in libc.so.6! That was strange, because I had used Java in my browser the day before without problems.
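Pulling the signal and the problematic frame out of such a log can be automated. Here is a minimal Python sketch; the function name `summarize_hs_err` and the regular expressions are my own assumptions, based only on the header shown above, and not an official parser for the HotSpot log format.

```python
import re

# Matches e.g. "# SIGSEGV (0xb) at pc=0x0094a8ac, pid=5587, tid=4108254096"
SIGNAL_RE = re.compile(
    r"^#[ \t]+(SIG[A-Z0-9]+)[ \t]+\(0x[0-9a-fA-F]+\)", re.MULTILINE
)

# Matches e.g. "# C [libc.so.6+0x718ac] memcpy+0x1c" (the symbol part is optional)
FRAME_RE = re.compile(
    r"^#[ \t]+C[ \t]+\[([^\]+]+)\+0x[0-9a-fA-F]+\]"
    r"(?:[ \t]+(\w+)\+0x[0-9a-fA-F]+)?",
    re.MULTILINE,
)


def summarize_hs_err(text):
    """Return (signal, library, symbol) found in the log header.

    Any part that cannot be found is returned as None.
    """
    sig = SIGNAL_RE.search(text)
    frame = FRAME_RE.search(text)
    signal = sig.group(1) if sig else None
    library = frame.group(1) if frame else None
    symbol = frame.group(2) if frame else None
    return signal, library, symbol
```

Feeding it the contents of an hs_err_pidXXXX.log file gives a one-line summary of which native library the crash happened in, which is handy when several such files pile up.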

I decided it was time to upgrade my browser, so I tried these steps:

  • updating from Firefox 3.0 to Firefox 3.6 - hmm, the Java plugin (libjavaplugin_oji.so) did not work

  • updating from JDK 1.6.0_04 to JDK 1.6.0_21 - hmm, the Java plugin (libjavaplugin_oji.so) still did not work, even when following the Firefox Java Plugin Doc

  • finally I found information about the new Java plugin technology, so Java was able to start in Firefox 3.6, but it crashed again - only the Java process this time, not Firefox, but I needed Java...



I tried clearing my whole Firefox user profile (including cache, cookies...) - but Java crashed again.

I tried to clear all my Java user profile data - but the result was the same - crash!

I even reproduced the crash with Java alone - no browser plugin involved. So now it seemed that the problem was not in the plugin or the browser at all.

The hs_err_pidXXXX.log still contained the same information - "problematic frame in libc.so.6". OK, it was time to look into that.

I found that I was using libc-2.7.so. It seemed that the problem could be there. OK, I had to try compiling the newest version and starting Java with it (I certainly could not replace the core library for the whole system!).

(Un)fortunately, I could not compile the libc sources.
It was horrible - reinstalling the whole system seemed to be my last chance to bring Java back to life.

I looked into the log file again and tried to decipher the information there:


Stack: [0xf4d9f000,0xf4df0000], sp=0xf4d6e0f4, free space=-196k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C [libc.so.6+0x718ac] memcpy+0x1c
C [libc.so.6+0x59c01] _IO_getline+0x41
C [libc.so.6+0x6236b] fgets_unlocked+0x5b
C [libnss_files.so.2+0x3045]
C [libnss_files.so.2+0x382b] _nss_files_gethostbyname2_r+0xdb


The last line was a ray of light in the dark. It looked like a problem with translating host names to IP addresses using a file - /etc/hosts!
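The Stack line in that excerpt carries a hint of its own. Here is a small sketch of the arithmetic, assuming the reported free space is simply the stack pointer minus the low end of the stack range, rounded down to whole kilobytes. That assumption reproduces the number in my log, but I have not checked it against the JVM sources.

```python
def stack_free_kb(stack_low, sp):
    """Distance from the low end of the thread stack up to sp, in whole KB."""
    # Python's >> on a negative int rounds toward minus infinity,
    # so a stack pointer below the stack range gives a negative result.
    return (sp - stack_low) >> 10


# Values copied from the "Stack:" line of my log.
free = stack_free_kb(0xF4D9F000, 0xF4D6E0F4)
print(free)  # -196
```

A negative value means the stack pointer was already below the memory reserved for the thread's stack. On x86 the stack grows downwards, so this suggests that the native code had run off the end of the stack.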

I looked into /etc/hosts and was surprised - there were two lines (records) with the same host name repeated over and over:
127.0.0.1 localhost localhost localhost .....
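The whole file was 3.3 MB in size! That was why it crashed in libc - it was an overflow error, caught by the system, which then killed the process! The negative free space in the Stack line of the log suggests that, most likely, the thread stack overflowed while the huge line was being read.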
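A quick sanity check of the hosts file would have saved me hours. Below is a minimal Python sketch; the function name `find_suspicious_hosts_lines` and the 1024-character threshold are arbitrary choices of mine, not a limit documented by libc.

```python
from collections import Counter


def find_suspicious_hosts_lines(text, max_len=1024):
    """Return (line_number, line_length, repeated_names) for odd-looking lines.

    A line is reported when it is longer than max_len characters or when
    the same host name appears on it more than once.
    """
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        entry = line.split("#", 1)[0]   # ignore comments
        names = entry.split()[1:]       # the first field is the IP address
        counts = Counter(names)
        repeated = sorted(name for name, count in counts.items() if count > 1)
        if len(line) > max_len or repeated:
            problems.append((number, len(line), repeated))
    return problems
```

Running it on the contents of /etc/hosts, for example with `find_suspicious_hosts_lines(open("/etc/hosts").read())`, should return an empty list for a healthy file. On my broken file it would have reported the line shown above.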

And why was only Java crashing? Maybe other applications would have crashed soon as well. Java crashed right after startup because it tried to resolve the host names stored in its cache of loaded applets. That was the reason!

Phew, I was very happy that the solution was so simple in the end.