Monitoring Component Pack / Kubernetes with Prometheus – Grafana

You have successfully installed HCL Component Pack for HCL Connections. Now you want to know what is happening inside that black box?
The goal is to end up with Grafana dashboards that show what the cluster is doing.

Steps to reproduce:
– Update Kubernetes to 1.18.12
– Install Helm 3
– Install nfs-client
– Install metrics
– Install Traefik
– Install Prometheus

Disclaimer: use the following at your own risk.

Update Kubernetes to 1.18, because it is now supported for Component Pack. If you are still on 1.11, you are in for a surprise once the upgrade reaches 1.16: CoreDNS will no longer start. CoreDNS removed the proxy plugin, so you need to switch to the forward plugin:
kubectl -n kube-system edit cm coredns
Change ‘proxy . /etc/resolv.conf’ to ‘forward . /etc/resolv.conf’.
Going to 1.19 was not an option. Some of the Helm charts ship Ingress or Service definitions that are incompatible with 1.19 because of the API changes between 1.18 and 1.19.
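
If you prefer not to edit the ConfigMap by hand, the same change can be sketched as a small filter. This is only an illustration: fix_corefile is a made-up helper name, and it assumes the directive appears on its own line exactly as ‘proxy . <target>’.

```shell
# Sketch: rewrite the removed "proxy" directive to "forward" in a Corefile.
# Reads the Corefile on stdin and writes the changed version to stdout.
# fix_corefile is a hypothetical helper name, not part of CoreDNS or kubectl.
fix_corefile() {
    # Keep the leading whitespace (\1) and only swap the directive name.
    sed -e 's/^\([[:space:]]*\)proxy \. /\1forward . /'
}

# Possible usage (prints the changed Corefile, does not apply it):
#   kubectl -n kube-system get cm coredns -o jsonpath='{.data.Corefile}' | fix_corefile
```

The output still has to be put back into the coredns ConfigMap, for example with the kubectl edit command shown above.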

Installing Helm 3
At the moment Component Pack still requires Helm 2, but both versions can be used in parallel.
Before installing Helm 3:
mv /usr/local/bin/helm /usr/local/bin/helm2
and after installing Helm 3:
mv /usr/local/bin/helm /usr/local/bin/helm3
Now my environment knows helm2 and helm3. It is possible to migrate from Helm 2 to Helm 3, but I wanted an emergency option so that I can re-install Component Pack if needed. helm2 list shows the Component Pack installs; helm3 list shows all the new stuff.

nfs-client:
I’m lazy. I already have a working NFS server in my environment and I don’t want to handle every single PV/PVC manually. That’s why I use the nfs-client provisioner. I cloned the repo, updated deploy/deployment.yaml with the values for my NFS server and applied the 3 yaml files:
kubectl apply -f deploy/class.yaml
kubectl apply -f deploy/deployment.yaml
kubectl apply -f deploy/rbac.yaml
kubectl patch storageclass managed-nfs-storage -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

The last line defines the new storage class as default.
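
To check that the provisioner really creates volumes, a throwaway claim can help. The following is a minimal sketch: render_test_pvc and the claim name test-claim are made up for this example, and the 1Mi size is arbitrary.

```shell
# Sketch: print a small PersistentVolumeClaim manifest for a given storage class.
# render_test_pvc and the default claim name "test-claim" are hypothetical.
render_test_pvc() {
    local name="${1:-test-claim}"
    local class="${2:-managed-nfs-storage}"
    cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${name}
spec:
  storageClassName: ${class}
  accessModes: ["ReadWriteMany"]
  resources:
    requests:
      storage: 1Mi
EOF
}

# Possible usage:
#   render_test_pvc | kubectl apply -f -
#   kubectl get pvc test-claim      # should become Bound after a moment
#   kubectl delete pvc test-claim
```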

metrics:
Get the yaml and add --kubelet-insecure-tls to the args.
I was not able to get the metrics pod up without this flag. Proper certificates with all the right SANs would probably help, but as this is only my lab, I don’t care.


wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.4.1/components.yaml


- args:
  - --cert-dir=/tmp
  - --secure-port=4443
  - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
  - --kubelet-use-node-status-port
  - --kubelet-insecure-tls

kubectl apply -f components.yaml

The metrics-server pod should now show up in the kube-system namespace.
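
The manual edit of components.yaml can also be scripted. This is a sketch under the assumption that the args list contains the --kubelet-use-node-status-port line, as in the v0.4.1 file; add_insecure_tls is a made-up helper name.

```shell
# Sketch: add "- --kubelet-insecure-tls" after the --kubelet-use-node-status-port
# argument, reusing that line's indentation. Does nothing if the flag already exists.
# add_insecure_tls is a hypothetical helper; it reads stdin and writes stdout.
add_insecure_tls() {
    awk '
        /--kubelet-insecure-tls/ { seen = 1 }
        { lines[NR] = $0 }
        END {
            for (i = 1; i <= NR; i++) {
                print lines[i]
                if (!seen && lines[i] ~ /--kubelet-use-node-status-port/) {
                    indent = lines[i]
                    sub(/-.*/, "", indent)   # keep only the leading whitespace
                    print indent "- --kubelet-insecure-tls"
                }
            }
        }
    '
}

# Possible usage:
#   add_insecure_tls < components.yaml > components-lab.yaml
#   kubectl apply -f components-lab.yaml
#   kubectl top nodes     # returns data once metrics-server is collecting
```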

Traefik:
To get the variables from the Helm chart:
helm3 show values traefik/traefik > values.yml

In order to prevent conflicts between the different ingress controllers I added this ingressClass.
Save it in ic.yaml and kubectl apply -f ic.yaml

apiVersion: networking.k8s.io/v1beta1
kind: IngressClass
metadata:
  name: traefik-lb
spec:
  controller: traefik.io/ingress-controller

There are now 2 ingress controllers on my system: Traefik and cnx-ingress.

I set the ports for web and websecure to 31080 and 31443 with
kubectl edit svc traefik

There’s also a dashboard available for Traefik.

Prometheus/Grafana
There are a lot of tutorials out there on how to install and configure Prometheus.

So here is the compressed version:


helm3 repo add stable https://charts.helm.sh/stable
helm3 repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm3 repo update
helm3 show values prometheus-community/kube-prometheus-stack > values.yaml

Edit the values.yaml file:
– enable the ingress creation, add ingressClassName: traefik-lb and define the hostname
– assign the storage class to the volumeClaimTemplates, managed-nfs-storage in my case

storage:
  volumeClaimTemplate:
    spec:
      storageClassName: managed-nfs-storage
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 20Gi
    # selector: {}


kubectl create namespace monitoring
helm3 install prometheus prometheus-community/kube-prometheus-stack -n monitoring -f values.yaml
helm3 install metrics-adapter prometheus-community/prometheus-adapter -n monitoring


Update the Grafana ingress and add the ingressClassName:
kubectl -n monitoring edit ingress prometheus-grafana

spec:
  ingressClassName: traefik-lb
  rules:
  - host: grafana.ume.li
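
The ingress class can also be set without an interactive edit, using a merge patch. The sketch below wraps kubectl patch in a helper; set_ingress_class and the KUBECTL override are made up for this example, the override only exists so that the command can be printed instead of executed.

```shell
# Sketch: set spec.ingressClassName on an existing Ingress with a merge patch.
# set_ingress_class is a hypothetical helper. KUBECTL is an assumed override,
# e.g. KUBECTL="echo kubectl" prints the command instead of running it.
set_ingress_class() {
    local ns="$1" ingress="$2" class="$3"
    ${KUBECTL:-kubectl} -n "$ns" patch ingress "$ingress" --type merge \
        -p "{\"spec\":{\"ingressClassName\":\"${class}\"}}"
}

# Possible usage:
#   set_ingress_class monitoring prometheus-grafana traefik-lb
```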

If you are using more than one node, make sure that the appropriate network ports (9000/9100 TCP) are open between the nodes.

The next step is to create the routes on your front proxy/load balancer so that the site is available under a nice URL.
The bundled Grafana already has Prometheus configured as a data source. The only thing I added was one extra dashboard.

HCL Connections Activities Plus

I ran into a small issue with Activities Plus. If you only want to install Activities Plus in your test Kubernetes cluster and you don’t want to upload all the images, the install will fail.

The support/setupImages.sh script has 2 errors in it.

  1. It does not recognize the parameter kudos-boards, so ./setupImages.sh -st kudos-boards … will fail. You need to change the line ‘starter_stack_options=’ and add ‘kudos-boards’ to the list.
  2. If you only run setupImages.sh -st kudos-boards, it will not push all the required images. You need to run it at least with -st customizer,kudos-boards
    or change the # Infra block in setupImages.sh to

            # Infra
            if [[ ${STARTER_STACKS} =~ "customizer" ]] || [[ ${STARTER_STACKS} =~ "orientme" ]] || [[ ${STARTER_STACKS} =~ "kudos-boards" ]]; then
                    arr=(haproxy.tar mongodb-rs-setup.tar mongodb.tar redis-sentinel.tar redis.tar appregistry-client.tar appregistry-service.tar)
                    for f in "${arr[@]}"; do
                            setup_image $f
                    done
            fi
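
To see which -st values pull in the infra images after this change, the condition can be tried in isolation. This is only a sketch: needs_infra is a made-up name, and it uses a case pattern as an equivalent substring test instead of the [[ =~ ]] form from the script.

```shell
# Sketch: same substring logic as the changed Infra block, as a standalone check.
# needs_infra is a hypothetical helper; $1 is the value passed to -st.
needs_infra() {
    case "$1" in
        *customizer*|*orientme*|*kudos-boards*) return 0 ;;
        *) return 1 ;;
    esac
}

# Possible usage:
#   needs_infra "kudos-boards" && echo "infra images will be pushed"
```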
    

I don’t like the default persistentVolume paths /pv-connections. I use /data for my test environment.
This needs a small tweak in the boards-cp.yaml file:

minio:
    persistentVolumePath: data
    nfs:
       server: 192.168.1.2

Connections 6.5 – Invite first impressions

Yesterday we upgraded our production environment from Connections 6.0 CR6 to 6.5. This time we decided to do an in-place update. Updating the base from WAS 8.5.5.15 to 8.5.5.16 and Connections from 6.0 CR6 to 6.5 would have taken around 2 hours of downtime in our case. Because we had an invalid proxy-config.tpl file (our fault), which broke the install process, it took 4 hours. Next time we should check the xml files first…

Today was “Invite” testing day. Combining the information in the selfregistration-config.xml file with the information from HCL, we were able to run the first tests.
Our current external user implementation, Domino with an external directory, does not fit the new Invite feature. The feature relies on LDAP: it writes the user to LDAP and then to Profiles. I have not been able to force the LDAP task to create the new user in my secondary directory.
I have now set up an OpenLDAP server for the external users.
If you do not want to handle OpenLDAP, there is a Domino way.

As an external user I’m not able to change my password at the moment, but I expect this to be possible in a future update. The workaround would be to use the “reset guest password” workflow on the login page.

One thing I had to change in AppSrv01/installedApps/[cell]/Invite.ear/invite.war/pages/register.jsp:
add readonly="true" to the <input id="mail" … /> field, so that invited users are no longer able to change their email address.

Do NOT use umlauts in Invite!

Connections 6.0 CR 5: ICM (mail plugin) vs. Wikis

We are running Connections 6.0 CR5 with the Mail / calendar Plugin.
The plugin is officially not supported.

One issue we ran into is that the wiki page loses its header as soon as we create or edit a page.
The Connections menu then appears below the page. Only a page refresh brings the menu back to the top.

Analyzing the page source revealed that lotusMain was placed before the banner.

To fix this you need to alter the JavaScript. Always make a backup! Test it first. Use the following at your own risk:
It’s in the file shared/provision/webresources/com.ibm.lconn.wikis.web.resources_[XY].jar
Unzip it.
Edit the file resources/scenes.js: at line 1792 replace
frame.insertBefore(el,frame.lastChild);
with
//frame.insertBefore(el,frame.lastChild);
dojo.place(el,frame,'last');

Repack everything to the same filename com.ibm.lconn.wikis.web.resources_[XY].jar and use it to replace the one in the webresources.
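
The edit itself can be sketched as a filter over scenes.js. Note the assumptions: patch_scenes_js is a made-up helper, it changes every line that consists only of the insertBefore statement (not just line 1792), and it expects no trailing whitespace or Windows line endings on that line. Check the result with diff before repacking.

```shell
# Sketch: comment out the insertBefore call and add the dojo.place call,
# keeping the original indentation. Reads scenes.js on stdin, writes to stdout.
# patch_scenes_js is a hypothetical helper name.
patch_scenes_js() {
    awk '
        {
            trimmed = $0
            sub(/^[ \t]+/, "", trimmed)
            if (trimmed == "frame.insertBefore(el,frame.lastChild);") {
                indent = substr($0, 1, length($0) - length(trimmed))
                print indent "//" trimmed
                # \047 is a single quote
                print indent "dojo.place(el,frame,\047last\047);"
            } else {
                print
            }
        }
    '
}

# Possible usage, inside the directory where the jar was unzipped:
#   cp resources/scenes.js resources/scenes.js.bak
#   patch_scenes_js < resources/scenes.js.bak > resources/scenes.js
#   diff resources/scenes.js.bak resources/scenes.js
```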

As this is a JavaScript change, you need to update the version stamp, stop your nodes and clean the WebSphere temp directories and any caching proxies before the change takes effect.
As stated before: use at your own risk.

IBM Connections Systemd on Centos / RHEL 7

Starting the Deployment Manager as a systemd service.
Save the code below to /etc/systemd/system/dmgr.service


[Unit]
Description="Connections Deployment Manager"
Requires=network.service
After=network-online.target

[Service]
User=connections
ExecStart=/opt/ibm/WebSphere/AppServer/profiles/Dmgr01/bin/startManager.sh
ExecStop=/opt/ibm/WebSphere/AppServer/profiles/Dmgr01/bin/stopManager.sh
PIDFile=/opt/ibm/WebSphere/AppServer/profiles/Dmgr01/logs/dmgr/dmgr.pid
Type=forking
Restart=no
TimeoutStartSec=6000
TimeoutStopSec=600
LimitNOFILE=60000
LimitNPROC=12500

[Install]
WantedBy=multi-user.target

Enable dmgr at boot time:
systemctl enable dmgr.service

Some more examples:
systemctl start dmgr.service
systemctl stop dmgr.service
systemctl status dmgr.service
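
After saving or changing a unit file, systemd has to reload its configuration before it picks up the change. A small sketch that combines the steps; enable_service and the SYSTEMCTL override are made up for this example, the override only serves to print the commands instead of running them.

```shell
# Sketch: reload systemd, then enable and start a unit.
# enable_service is a hypothetical helper. SYSTEMCTL is an assumed override,
# e.g. SYSTEMCTL="echo systemctl" prints the commands instead of running them.
enable_service() {
    local unit="$1"
    ${SYSTEMCTL:-systemctl} daemon-reload &&
    ${SYSTEMCTL:-systemctl} enable "$unit" &&
    ${SYSTEMCTL:-systemctl} start "$unit"
}

# Possible usage (as root):
#   enable_service dmgr.service
```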

Connections – Mass file removal

I got a request to clean up some files in an IBM Connections environment: remove all the files created before 1 January 2015.
Getting the list of files is easy.

connect to files@
export to files.csv of del modified by nochardel
select hex(ID), id, title, create_date from files.media where create_date < date('2015-01-01')@
connect reset@

This got me around 11'000 records.
In the API documentation I found the required API call.
After a while I had my first node.js app, which correctly posted the required requests against my server. Doing a test run with 11'000 records was a bad idea. My environment was so fast that it sent the 11'000 requests in under 10 seconds.
The first couple of requests passed, but then I got a lot of errors and the Files app did not respond at all. With the help of a simple rate limiter I throttled it down to 2 requests per second. The process now takes a bit longer, but at least the Connections environment stays alive.
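
The throttling idea does not depend on node.js. As a rough illustration in shell (not the app I used): throttle is a made-up helper that runs one command per input line and then sleeps, so a delay of 0.5 gives at most 2 calls per second. delete_file in the usage comment is a hypothetical function that would wrap the actual API call, and fractional sleep values assume GNU sleep.

```shell
# Sketch: run a command once per input line with a fixed pause between calls.
# throttle is a hypothetical helper: $1 is the delay in seconds, the rest is
# the command; each input line is appended as its last argument.
throttle() {
    local delay="$1"
    shift
    local line
    while IFS= read -r line; do
        # </dev/null keeps the command from eating the remaining input lines.
        "$@" "$line" </dev/null || echo "failed: $line" >&2
        sleep "$delay"
    done
}

# Possible usage, first csv column as the id (delete_file is hypothetical):
#   cut -d, -f1 files.csv | throttle 0.5 delete_file
```

Failures are only logged to stderr, so one bad record does not stop the whole run.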

Connections Customizer Lite

As we are currently only using the Customizer from the stack formerly known as pink (sfkap), I was surprised to find the Customizer Lite in the downloads. As my Connections 6 lab is already running with the sfkap, I decided to give my Connections 5.5 lab an upgrade. The installation experience was ok. It’s not HA, but that’s ok, as my lab is not HA either.
I only added an nginx reverse proxy on the same box. It should also be possible to run a dockerized nginx if needed, but I’m happy with the nginx on the host.

Applying some basic customizations, such as adding css or js files, seems to work with Connections 5.5.

In my opinion the effort to set up the Customizer Lite and use it is ok, compared to fiddling around with hikariTheme/custom.css and some nasty JavaScript/dojo plugins.

Time to play around…

IBM Customizer – Do not reboot the worker node – personal reminder

Rule #1: Drain the worker node before a reboot.
Consequences if you do not follow Rule #1: the MongoDB gets corrupted and you need to repair it.
Steps to repair the Customizer DB: delete the contents of the /data/db/ subdirectory of mongo-node-0, mongo-node-1 and mongo-node-2, and register the Customizer apps in the appregistry again.
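
For the record, Rule #1 as commands. A sketch only: drain_node, uncordon_node and the KUBECTL override are made up for this example, and worker1 is a placeholder node name. The --delete-local-data flag matches the kubectl versions of that time; newer kubectl releases call it --delete-emptydir-data.

```shell
# Sketch: drain a worker node before the reboot, uncordon it afterwards.
# drain_node / uncordon_node are hypothetical helpers. KUBECTL is an assumed
# override, e.g. KUBECTL="echo kubectl" prints the command instead of running it.
drain_node() {
    ${KUBECTL:-kubectl} drain "$1" --ignore-daemonsets --delete-local-data
}

uncordon_node() {
    ${KUBECTL:-kubectl} uncordon "$1"
}

# Possible usage (worker1 is a placeholder):
#   drain_node worker1
#   ... reboot the node and wait until it is Ready again ...
#   uncordon_node worker1
```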

I’m sure that there’s a way to repair the mongo db directories…