Sunday, October 26, 2014

Piped merge error - what is wrong?

Last week a friend of mine asked me about a strange error he was getting on TNPM. Basically, he had many gaps in the report data for all devices, and the problem was apparently intermittent.

The error message on the log was the following:


V1:3353 2014.10.22-02.08.43 UTC LDR.1-21884:9897        SQLLDR  2 SQL Loader started
V1:3354 2014.10.22-02.08.43 UTC LDR.1-21884:13196       SQLLDR  3 Starting Piped Merge
V1:3355 2014.10.22-02.09.10 UTC LDR.1-21884:13196       MERGE_ERROR     GYMDC10118F Piped Merge Error: No such device or address:Transfer error
V1:3356 2014.10.22-02.09.10 UTC LDR.1-21884:13196       SQLLDRKILL      GYMDC10102W Killed sqlldr pid=a UnixProcess (Inactive: exitStatus nil, Error: Success) , result=a UnixProcess (Inactive: exitStatus nil, Error: Success)
V1:3357 2014.10.22-02.09.10 UTC LDR.1-21884:9897        ORASQLLDR       GYMDC10104F  ErrorCode=nil CommandLine=$ORA_HOME/11.2.0-client32/bin/sqlldr userid=PV_LDR_01/xxxx@pv log=...datachannel/LDR.1/state/2014.10.22-00/MERGED~000.1DGA.BOF.log control=...datachannel/LDR.1/loader.gagg.ctl logErrors=

After some investigation, it was evident that for some reason, the unix pipe created during the data merge was getting corrupted.

It is important to know that the LDR component has two options for merging and loading the data files. In the topology editor, if the option "USE_PIPE" is false, the LDR generates an intermediate file with the merged data and then loads this file with Oracle sqlldr. If "USE_PIPE" is true, it creates a unix pipe and sqlldr reads the merged data directly from it. Some people say that using the pipe is faster, because you don't have to create the intermediate file, but this can cause issues as well, as we will see.
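To make the pipe idea concrete, here is a minimal sketch of a unix named pipe using plain shell commands. It is only an illustration: printf stands in for the merge step and wc stands in for sqlldr, and the file names are made up. It is not how the LDR is actually implemented.

```shell
# Illustration only: printf plays the "merge" role, wc plays the "loader" role.
demo_dir=$(mktemp -d)
fifo="$demo_dir/merged.pipe"   # hypothetical name, not the real LDR pipe
mkfifo "$fifo"

# Writer side: sends the merged rows into the pipe in the background.
printf 'row1\nrow2\nrow3\n' > "$fifo" &

# Reader side: consumes the pipe as if it were a regular file.
# No intermediate file with the merged data is ever written to disk.
loaded=$(wc -l < "$fifo" | tr -d ' ')
wait

echo "rows read through the pipe: $loaded"
rm -rf "$demo_dir"
```

The writer blocks until the reader opens the pipe, so both sides depend on the filesystem entry behaving like a real local FIFO.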

The TNPM system I mentioned had been using the pipe method for quite a long time before the issue occurred. So it had to be something in the system itself that had changed. And indeed, it was.

When using the pipe method, the pipe pointer is created under /tmp/LDR.X, where X is the LDR channel number. This works fine if /tmp is mounted locally, but if for any reason the IT team decides to mount it from a remote data store... well, you will have problems. This was exactly what happened. The IT team decided to mount /tmp from a remote data store for the VMware cluster. Once the datachannel was running on the cluster, the pipe load was affected.

So, we deactivated the pipe (USE_PIPE=false) on the topology editor, deployed the topology and the problem was solved.
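If you want to check whether a host might be affected by the same cause, a rough sketch like the one below can tell you which filesystem type /tmp is on. The function names and the list of "remote" types are my own assumptions and are not exhaustive. The -T option requires GNU df, as found on Linux.

```shell
# Print the filesystem type of the given path (GNU df assumed for -T).
fstype_of() {
  df -PT "$1" | awk 'NR == 2 { print $2 }'
}

# Return 0 if the type looks like a network filesystem.
# Assumption: this short list is illustrative, extend it for your site.
is_remote_fstype() {
  case "$1" in
    nfs|nfs4|cifs|smbfs|fuse.sshfs) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   t=$(fstype_of /tmp)
#   if is_remote_fstype "$t"; then echo "/tmp is on $t: avoid USE_PIPE=true"; fi
```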

I could not find a way to change where to create the pipe pointer, so it must be hard-coded somewhere. If you know how to do it, let me know :)

Sunday, October 19, 2014

link($DC_HOME/bin/visual,CMGR_visual) failed with error 18

If you ever installed the old version of TNPM (Proviso), you know that it was not possible to split the datachannel binaries from the data using two different locations. This was not very smart, considering that it is a common practice among many companies.

Since version 4.4.1 (I believe), this option exists and you can configure different locations for your datachannel binary files and the data files.

But there is a catch if you use different partitions for the split (which is also the common practice... your data partition is usually remote mounted from the company data cluster).

The limitation is in the visual binary. You cannot execute it from a remote partition. If you try, you will receive the following error on the screen:

link($DC_HOME/bin/visual,CMGR_visual) failed with error 18

This happens for all the tools that use the visual binary to bootstrap themselves (cmgr, amgr, frmi, etc...).

Fortunately, the solution is very simple:

1) Go to the $DC_HOME/bin folder
2) Open the run script used to start the tool. For instance "cmgr" for the CMGR component.
3) Go to the end of the script and add the "cd $DC_BIN_HOME" line (the first line shown below) right above the last line:

cd $DC_BIN_HOME

$DC_BIN_HOME/pvexec CMGR_visual $DC_BIN_HOME/visual -nologo -noherald $DC_BIN_HOME/dc.im -headless -a CMGR "$@"

This will make sure you go back to the partition where your datachannel binaries are installed, before executing the visual command.

You have to do the same for all run scripts.
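Since the same change is needed in every run script, it can be scripted. The following is only a sketch: the function name is mine, and it assumes that the pvexec call really is the last line of each script (no trailing blank lines). It keeps a .bak copy and skips scripts that already contain the line.

```shell
# Insert 'cd $DC_BIN_HOME' right above the last line of a run script.
# Assumption: the last line of the script is the pvexec call.
add_cd_before_last_line() {
  script=$1
  # Do nothing if the line is already there.
  if grep -q '^cd \$DC_BIN_HOME$' "$script"; then
    return 0
  fi
  cp -p "$script" "$script.bak" || return 1
  # Print every line but the last, then the cd line, then the last line.
  awk 'NR > 1 { print prev }
       { prev = $0 }
       END { if (NR > 0) { print "cd $DC_BIN_HOME"; print prev } }' \
    "$script.bak" > "$script"
}

# Usage (script names taken from the list above, adjust to your install):
#   for s in cmgr amgr frmi; do add_cd_before_last_line "$DC_HOME/bin/$s"; done
```

Check the result with diff against the .bak file before restarting the components.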

That's it.


Thursday, September 18, 2014

Datachannel component is not listed on "dccmd status all"

Suppose you cannot see BCOL.1.1 when you execute dccmd status all. Below are some things to investigate:

1) Try to run it manually. Run:
dccmd start BCOL.1.1

2) Check the proviso.log and search for any error message. Run:
grep "BCOL\.1\.1" proviso.log

3) If you see a WALKBACK error on the log, search for the walkback file and read the first lines to see the main reason for the walkback 

4) Check if the component configuration exists in the database. Run:
dccmd debug CMGR "self dbCfgPrint" | grep BCOL.1.1

This should show the component configuration in the database. Check for any mistakes and correct them using the topologyEditor.

If you don't see any value, the component was not created on the topologyEditor or was not saved correctly in the database.

5) Check if any filesystem is full. Run the command below and make sure no partition has 100% utilization:
df -kh

6) Restart some datachannel manager components (cmgr, amgr, cns) and try again.

If you still cannot find the reason, it is a good moment to open a ticket with IBM.
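The filesystem check in step 5 is easy to automate. Below is a small sketch that reads df output in POSIX format and prints the mount points at or above a given usage threshold. The function name is my own, and mount points containing spaces are not handled.

```shell
# Read `df -kP` output on stdin and print "mountpoint use%" for every
# filesystem whose utilization is at or above the threshold (default 100).
report_full() {
  threshold=${1:-100}
  awk -v t="$threshold" '
    NR > 1 {
      use = $5
      sub(/%$/, "", use)            # strip the trailing percent sign
      if (use + 0 >= t) print $6 " " use "%"
    }'
}

# Usage:
#   df -kP | report_full 100
```

No output means no filesystem reached the threshold.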

Thursday, April 10, 2014

Tivoli Integrated Portal (TIP) portlet authorization

If you ever created custom pages in TIP, you may have encountered authorization issues. This is because you have to set the user authorization on different levels. The official documentation guides you on creating roles, groups and users and setting their relationships, but doesn't mention portlet authorization.

Each page you create can include a portlet. The most common is the Web Widget, necessary to open a web page. By default, only the "administrator" role has access to it. So, if you create a page, add the web widget, and give non-admin users access to the page, they will get a blank (grey) page with no error message when they try to open it.

When that happens, go to the server log "SystemOut.log" and look for the following error:

"doStartTag() user does not have permissions for view mode"

Usually this is related to the portlet authorization. You will have to configure it as well.
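A quick way to confirm this from the shell is to count how many times the message appears in the log. This is just a sketch: the function name is mine, and the location of SystemOut.log depends on your TIP installation.

```shell
# Count occurrences of the portlet permission error in a log file.
# -F makes grep treat the parentheses literally.
count_portlet_auth_errors() {
  grep -cF 'doStartTag() user does not have permissions for view mode' "$1"
}

# Usage:
#   count_portlet_auth_errors /path/to/SystemOut.log
```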

To configure the portlet authorization, go to "Settings -> portlets" and click on the portlet. This will open the portlet configuration page. Click "Next" until you find the Security tab:


You can now add the correct roles to the authorization containers. Select "User" and add the role for your final user. Click "Next" and "Finish".

Now you can log in again with the final user, open the page and the portlet will work fine.

Tuesday, March 25, 2014

TNPM 1.3.2 support for IE9 and Firefox ESR10

This link explains how to add support for Internet Explorer 9 and Firefox ESR10, but it is quite old.

I've just installed the latest fix packs for TCR and TIP, and the steps I followed are listed below:

Download the necessary packages

From IBM fixcentral download the following (please match the correct bit set for your installation):
  • Tivoli Netcool Performance Manager 1.3.2.0-TIV-TNPM-IF0042
  • Tivoli Integrated Portal 2.2.0-TIV-TIP-Linux64-FP0011
  • Tivoli Integrated Portal 2.2.0-TIV-TIP-FITSuit-FP00011
  • Tivoli Common Report 2.1.1.0-TIV-TCR-Linux64-FP2


Install in the following sequence (mandatory)


1) Tivoli Netcool Performance Manager 1.3.2.0-TIV-TNPM-IF0042

      Follow the instructions here

2) Tivoli Integrated Portal 2.2.0-TIV-TIP-Linux64-FP0011

      Follow the instructions on the 22011-fixpack-guide-PDF.pdf file (downloaded together with the packages from fixcentral)

3) Tivoli Common Report 2.1.1.0-TIV-TCR-Linux64-FP2

      Follow the instructions on the TCR211_FP2_Readme.txt file (downloaded together with the packages from fixcentral)

NOTE1: If for any reason you want to install using the console mode (Install.bin -i console), it still requires a valid DISPLAY set.

NOTE2: If for any reason you want to install using the silent mode (Install.bin -i silent -f <response_file>), you will not see any output at all on the shell, but the fix pack will run. You can follow up the process using the logs on <TCR_home>tipv2Components/TCRComponent/logs


Friday, February 7, 2014

Installing TNPM 1.3.2 with RHEL 5.9 Issues and Workarounds

Hi, I'm currently installing a fresh TNPM 1.3.2 platform and I believe some of the issues I found could be of interest. So, as I find them, I will list them here.


Oracle 32bit client MUST be installed on the 64bit server as well

Although the documentation states the opposite ( see here ), it is mandatory to have the 32bit client installed; otherwise, the topologyEditor and deployer won't be able to connect to the database. You MUST use the 32bit JDBC version when installing the topologyEditor, and in the topologyEditor you MUST configure, for your database host, the ORACLE_HOME_ROOT property to point to your 64bit Oracle install and the ORACLE_HOME property to point to the 32bit client (and not the 64bit installed server).


Missing package for topologyEditor installation

If you try to install the topologyEditor and get this message:


"Graphical installers are not supported by the VM. The console mode will be used instead..."

This happens because some 32bit libraries are missing. In my case, for RHEL 5.9, the missing 32bit libraries were libXres.i386 and libXtst.i386.

Note that the same error doesn't occur on RHEL 5.5...


Missing packages for provisoinfo GUI on Datamart

If you see the following error message when opening the provisoinfod:

"java.lang.UnsatisfiedLinkError : awt (An exception was pending after running JNI_OnLoad)"

Please make sure all libX... packages have both the 64bit and the 32bit versions installed:

# yum list installed | grep libX
libX11.i386                                1.0.3-11.el5_7.1            installed
libX11.x86_64                              1.0.3-11.el5_7.1            installed
libXTrap.i386                              1.0.0-3.1                   installed
libXTrap.x86_64                            1.0.0-3.1                   installed
libXau.i386                                1.0.1-3.1                   installed
libXau.x86_64                              1.0.1-3.1                   installed
libXaw.i386                                1.0.2-8.1                   installed
libXaw.x86_64                              1.0.2-8.1                   installed
libXcursor.i386                            1.1.7-1.2                   installed
libXcursor.x86_64                          1.1.7-1.2                   installed
libXdmcp.i386                              1.0.1-2.1                   installed
libXdmcp.x86_64                            1.0.1-2.1                   installed
libXext.i386                               1.0.1-2.1                   installed
libXext.x86_64                             1.0.1-2.1                   installed
libXfixes.i386                             4.0.1-2.1                   installed
libXfixes.x86_64                           4.0.1-2.1                   installed
libXfont.i386                              1.2.2-1.0.4.el5_7           installed
libXfont.x86_64                            1.2.2-1.0.4.el5_7           installed
libXfontcache.i386                         1.0.2-3.1                   installed
libXfontcache.x86_64                       1.0.2-3.1                   installed
libXft.i386                                2.1.10-1.1                  installed
libXft.x86_64                              2.1.10-1.1                  installed
libXi.i386                                 1.0.1-4.el5_4               installed
libXi.x86_64                               1.0.1-4.el5_4               installed
libXinerama.i386                           1.0.1-2.1                   installed
libXinerama.x86_64                         1.0.1-2.1                   installed
libXmu.i386                                1.0.2-5                     installed
libXmu.x86_64                              1.0.2-5                     installed
libXp.i386                                 1.0.0-8.1.el5               installed
libXp.x86_64                               1.0.0-8.1.el5               installed
libXpm.i386                                3.5.5-3                     installed
libXpm.x86_64                              3.5.5-3                     installed
libXrandr.i386                             1.1.1-3.3                   installed
libXrandr.x86_64                           1.1.1-3.3                   installed
libXrender.i386                            0.9.1-3.1                   installed
libXrender.x86_64                          0.9.1-3.1                   installed
libXres.i386                               1.0.1-3.1                   installed
libXres.x86_64                             1.0.1-3.1                   installed
libXt.i386                                 1.0.2-3.2.el5               installed
libXt.x86_64                               1.0.2-3.2.el5               installed
libXtst.i386                               1.0.1-3.1                   installed
libXtst.x86_64                             1.0.1-3.1                   installed
libXv.i386                                 1.0.1-4.1                   installed
libXv.x86_64                               1.0.1-4.1                   installed
libXxf86dga.i386                           1.0.1-3.1                   installed
libXxf86dga.x86_64                         1.0.1-3.1                   installed
libXxf86misc.i386                          1.0.1-3.1                   installed
libXxf86misc.x86_64                        1.0.1-3.1                   installed
libXxf86vm.i386                            1.0.1-3.1                   installed
libXxf86vm.x86_64                          1.0.1-3.1                   installed
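Comparing a list like the one above by eye is error prone, so here is a small sketch that reports every libX package for which one of the two architectures is missing. The function name is my own. It assumes the RHEL 5 naming shown above (i386 and x86_64) and reads the package list from stdin.

```shell
# Read `yum list installed` output on stdin and print every libX package
# that is missing either its i386 or its x86_64 counterpart.
missing_arch() {
  awk '
    $1 ~ /^libX.*\.(i386|x86_64)$/ {
      n = split($1, parts, ".")
      arch = parts[n]
      name = substr($1, 1, length($1) - length(arch) - 1)
      seen[name, arch] = 1
      names[name] = 1
    }
    END {
      for (name in names) {
        if (!((name, "i386") in seen))   print name ".i386"
        if (!((name, "x86_64") in seen)) print name ".x86_64"
      }
    }' | sort
}

# Usage:
#   yum list installed | missing_arch
```

An empty result means every installed libX package has both architectures.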

Using oracle instance name different than PV

If you need to use another instance name, please make sure that you define EXACTLY the same value everywhere, especially the correct letter case. This is case-sensitive for the database instance creation!!

CheckDB script fails mentioning the os version is not supported

As you can see in the image below, the RHEL 5 updates 1 to 10 are supported by TNPM 1.3.2, but the checkdb script expects version 5.5U (hardcoded on checkdb.ini). Change the value in the checkdb.ini file and it should work.




Manual copy and execution of install packages

When installing TNPM, the deployer assumes you will enter the passwords for the application user and the root user on all remote servers. These are used to copy and execute the install package remotely. This works fine when you have those passwords, but in a real company environment it is unlikely that you will receive them.

It is common to be able to do a "sudo" to the required application user and to transfer files using ssh shared keys among the servers (so no password needed).

Unfortunately, the deployer is not capable of transferring files remotely using ssh shared keys (this was identified as an APAR in Nov 2011...)

So, the solution is to disable the "remote file transfer" and "remote command execution" options and do the work manually. The only problem is that there is another bug related to the install package that should be copied over (APAR opened in Jan 2012...)

Below you can find the steps necessary to manually create and execute the packages.

On the deployer, disable the "remote file transfer" and the "remote execution" options. Afterwards, execute the installation plan step by step. It will give you an error for each step related to remote hosts. Once you see the error, right-click on the step and check the command output. You will see a message saying that you have to copy over the files under the /tmp/ProvisoConsumer/(...)/runtime directory. Follow the instructions below for each step.

Deployer step: OS check

On the deployer host as root user:

  1. # cd /tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00001_CheckSys_OS_step_1_1/runtime
  2. # mkdir package
  3. # cd package/
  4. # cp ../../bin/* .
  5. # cd ../../
  6. # tar cvf runtime.tar runtime/
On the remote host as application user:

  1. # mkdir -p /tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00001_CheckSys_OS_step_1_1/runtime
  2. # cd /tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00001_CheckSys_OS_step_1_1
  3. # scp <deployer_host>:/tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00001_CheckSys_OS_step_1_1/runtime.tar .
  4. # tar xvf runtime.tar
On the remote host as root user:

  1. # cd /tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00001_CheckSys_OS_step_1_1/runtime
  2. # ksh run.sh
If successfully executed (return code(s) = 0), go back to the deployer and change the step status to success for the "preparation" step and the "execution" step.
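The same pattern (build the package on the deployer host, copy it over, unpack it, run run.sh) repeats for every step that follows, so it can be wrapped in two helper functions. This is only a sketch under my own assumptions: the function names are made up, the step directory and the package files are passed as arguments, and ssh shared keys are expected to be in place for scp.

```shell
# On the deployer host: copy the given files into <step_dir>/runtime/package
# and create <step_dir>/runtime.tar.
build_runtime_tar() {
  step_dir=$1
  shift
  mkdir -p "$step_dir/runtime/package" || return 1
  cp -R "$@" "$step_dir/runtime/package/" || return 1
  ( cd "$step_dir" && tar cf runtime.tar runtime/ )
}

# On the remote host: fetch runtime.tar from the deployer host and unpack it
# into the same path. Relies on ssh shared keys, so no password is needed.
fetch_runtime_tar() {
  deployer_host=$1
  step_dir=$2
  mkdir -p "$step_dir/runtime" || return 1
  ( cd "$step_dir" &&
    scp "$deployer_host:$step_dir/runtime.tar" . &&
    tar xf runtime.tar )
}

# Usage for the OS check step (placeholders as in the steps above):
#   step=/tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00001_CheckSys_OS_step_1_1
#   build_runtime_tar "$step" "$step"/bin/*          # deployer host
#   fetch_runtime_tar <deployer_host> "$step"        # remote host
#   cd "$step/runtime" && ksh run.sh                 # remote host
```

For the other steps, only the step directory and the files copied into the package change.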

Deployer step: Database check

On the deployer host as root user:

  1. # cd /tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00002_CheckSys_DB_step_1_2/runtime
  2. # mkdir package
  3. # cd package/
  4. # cp ../../bin/* .
  5. # cd ../../
  6. # tar cvf runtime.tar runtime/
On the remote host as application user:

  1. # mkdir -p /tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00002_CheckSys_DB_step_1_2/runtime
  2. # cd /tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00002_CheckSys_DB_step_1_2
  3. # scp <deployer_host>:/tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00002_CheckSys_DB_step_1_2/runtime.tar .
  4. # tar xvf runtime.tar
On the remote host as root user:

  1. # cd /tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00002_CheckSys_DB_step_1_2/runtime
  2. # ksh run.sh
If successfully executed (return code(s) = 0), go back to the deployer and change the step status to success for the "preparation" step and the "execution" step.

Deployer step: JRE check

On the deployer host as root user:

  1. # cd /tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00003_JRE_step_2_1/runtime
  2. # mkdir package
  3. # cd package/
  4. # cp <tnpm_image>/v132/proviso/RHEL/jre/RHEL5/ibm-java-jre-6.0-8.1-linux-i386.tgz .
  5. # cd ../../
  6. # tar cvf runtime.tar runtime/
On the remote host as application user:

  1. # mkdir -p /tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00003_JRE_step_2_1/runtime
  2. # cd /tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00003_JRE_step_2_1
  3. # scp <deployer_host>:/tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00003_JRE_step_2_1/runtime.tar .
  4. # tar xvf runtime.tar
On the remote host as application user:

  1. # cd /tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00003_JRE_step_2_1/runtime
  2. # ksh run.sh
If successfully executed (return code(s) = 0), go back to the deployer and change the step status to success for the "preparation" step and the "execution" step.

Deployer step: Remote DataChannel setup

On the deployer host as root user:

  1. # cd /tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00004_DataChannel_step_8_1/runtime
  2. # mkdir package
  3. # cd package/
  4. # cp -R <tnpm_image>/v132/proviso/RHEL/DataChannel/* .
  5. # cd ../../
  6. # tar cvf runtime.tar runtime/
On the remote host as application user:

  1. # mkdir -p /tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00004_DataChannel_step_8_1/runtime
  2. # cd /tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00004_DataChannel_step_8_1
  3. # scp <deployer_host>:/tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00004_DataChannel_step_8_1/runtime.tar .
  4. # tar xvf runtime.tar
On the remote host as application user:

  1. # cd /tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00004_DataChannel_step_8_1/runtime
  2. # ksh run.sh
If successfully executed (return code(s) = 0), go back to the deployer and change the step status to success for the "preparation" step and the "execution" step.

Deployer step: Remote Datamart setup

On the deployer host as root user:

  1. # cd /tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00003_DataMart_step_7_1/runtime
  2. # mkdir package
  3. # cd package/
  4. # cp -R <tnpm_image>/v132/proviso/RHEL/DataMart/RHEL5/* .
  5. # cd ../../
  6. # tar cvf runtime.tar runtime/
On the remote host as application user:

  1. # mkdir -p /tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00003_DataMart_step_7_1/runtime
  2. # cd /tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00003_DataMart_step_7_1
  3. # scp <deployer_host>:/tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00003_DataMart_step_7_1/runtime.tar .
  4. # tar xvf runtime.tar
On the remote host as root user:

  1. # cd /tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00003_DataMart_step_7_1/runtime
  2. # ksh run.sh
If successfully executed (return code(s) = 0), go back to the deployer and change the step status to success for the "preparation" step and the "execution" step.

Deployer step: Remote TIP (TCR) setup

Please install TCR manually using its own installer and using the application username. After installing, mark this step as successful.

DO NOT install TCR using the deployer, since it will install the portal as root and, as if that was not unsafe enough for an application portal (usually a public portal), it will execute chmod -R 777 <tip_home> as the last step. (yes, you can yell now...)

Deployer step: Remote Dataview setup

On the deployer host as root user:

  1. # cd /tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00004_DataView_step_10_1/runtime
  2. # mkdir package
  3. # cd package/
  4. # cp -R <tnpm_image>/v132/proviso/RHEL/DataView/RHEL5/* .
  5. # cd ../../
  6. # tar cvf runtime.tar runtime/
On the remote host as application user:

  1. # mkdir -p /tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00004_DataView_step_10_1/runtime
  2. # cd /tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00004_DataView_step_10_1
  3. # scp <deployer_host>:/tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00004_DataView_step_10_1/runtime.tar .
  4. # tar xvf runtime.tar
On the remote host as application user:

  1. # cd /tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00004_DataView_step_10_1/runtime
  2. # ksh run.sh
If successfully executed (return code(s) = 0), go back to the deployer and change the step status to success for the "preparation" step and the "execution" step.

Deployer step: Remote Dataload setup

On the deployer host as root user:

  1. # cd /tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00003_DataLoad_step_9_1/runtime
  2. # mkdir package
  3. # cd package/
  4. # cp -R <tnpm_image>/v132/proviso/RHEL/DataLoad/* .
  5. # cd ../../
  6. # tar cvf runtime.tar runtime/
On the remote host as application user:

  1. # mkdir -p /tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00003_DataLoad_step_9_1/runtime
  2. # cd /tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00003_DataLoad_step_9_1
  3. # scp <deployer_host>:/tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00003_DataLoad_step_9_1/runtime.tar .
  4. # tar xvf runtime.tar
On the remote host as root user:

  1. # cd /tmp/ProvisoConsumer/Plan/MachinePlan_<remote_host>/00003_DataLoad_step_9_1/runtime
  2. # ksh run.sh
If successfully executed (return code(s) = 0), go back to the deployer and change the step status to success for the "preparation" step and the "execution" step.