Confusing generative AI term: ‘domain aware generative AI’

What follows is a warning to people looking to buy a new system, built with the same technology that powers ChatGPT, from a startup which may claim to have something brand new. These companies are taking something you can get today for $360/year/user and charging orders of magnitude more by confusing their customers.

One term that seemed weird to me is “domain aware generative AI.” I dug in and discovered that this phrase is used just so that a customer can’t look up the real name: Open Domain Question Answering (ODQA). If you have Microsoft Office with Copilot enabled, you have this today. It’s the thing that happens when you put a bunch of files in SharePoint and then use Copilot to ask questions about those documents, nothing more. One of the companies using the term claims to have a patent on ODQA: https://ppubs.uspto.gov/dirsearch-public/print/downloadPdf/20240062019. Note that one can find papers on ODQA which predate this patent application, so I am of the opinion that Microsoft does not need to worry about a patent troll any time soon. One example which predates the patent, and which links to even older articles, is here: https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00530/114590/Improving-the-Domain-Adaptation-of-Retrieval.

What is ODQA? It’s a technique for generating answers with a generative AI while significantly reducing the opportunity to hallucinate. ODQA requires a two-stage pipeline: a retrieval process that selects paragraphs and other text relevant to a given question, combined with a generator that produces an answer from the selected passages. That two-stage pipeline assumes a search mechanism of some sort. I’ve built such a system, which did the following:

  1. Read the corpus of content and cut it into paragraphs of up to 1K tokens.
  2. For each paragraph, create an embedding using a model such as text-embedding-ada-002 from OpenAI or another embedding model which understands the language of the document.
  3. Store the embedding alongside the document path and page number in a vector database such as Weaviate.

Using that system, one could then take a question like “How does our quality control process provide widgets which satisfy the requirement to <do a task>?” The vector database uses a cosine similarity search to identify the paragraphs most aligned with the question. One then submits those paragraphs to an LLM like Mistral, Llama, GPT-4, etc. along with a prompt which states something like:

Answer the question 'How does our quality control process provide widgets which satisfy the requirement to <do a task>?' using the information below. The information is formatted as [DOC NAME]; [Pg Number]. In the response, cite the [DOC NAME] and [Pg Number]. If you can't find the answer in the attached content, respond that you do not know.

That prompt will also contain the relevant content, formatted as described. The LLM will respond “I do not know the answer” if no answer can be found in the supplied passages.
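To make the two stages concrete, here is a minimal Python sketch of the whole flow. It is illustrative only: it uses the OpenAI Python SDK (v1 style), substitutes a plain in-memory cosine search for a real vector database like Weaviate, and the corpus entries are made-up placeholders.

import numpy as np
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def embed(text: str) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-ada-002", input=text)
    return np.array(resp.data[0].embedding)

# Indexing: one entry per paragraph (up to ~1K tokens), with source info.
corpus = [
    {"doc": "QualityManual.docx", "page": 12, "text": "Our quality control process ..."},
    # ... one entry per paragraph from the corpus ...
]
for entry in corpus:
    entry["embedding"] = embed(entry["text"])

def answer(question: str, top_k: int = 3) -> str:
    # Stage 1 (retrieval): rank paragraphs by cosine similarity to the question.
    q = embed(question)
    scored = sorted(
        corpus,
        key=lambda e: float(
            np.dot(q, e["embedding"])
            / (np.linalg.norm(q) * np.linalg.norm(e["embedding"]))
        ),
        reverse=True,
    )[:top_k]
    passages = "\n\n".join(
        f"[{e['doc']}]; [{e['page']}]\n{e['text']}" for e in scored
    )

    # Stage 2 (generation): answer only from the retrieved passages.
    prompt = (
        f"Answer the question '{question}' using the information below. "
        "The information is formatted as [DOC NAME]; [Pg Number]. "
        "In the response, cite the [DOC NAME] and [Pg Number]. "
        "If you can't find the answer in the attached content, "
        "respond that you do not know.\n\n" + passages
    )
    chat = client.chat.completions.create(
        model="gpt-4", messages=[{"role": "user", "content": prompt}]
    )
    return chat.choices[0].message.content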

So now, if you hear the term “Domain Aware Generative AI,” know that this is ODQA, something you already have in SharePoint today with Copilot.


Note to self: Management endpoint for Azure Gov Cloud

This is just a note for me and anyone else who may need this.

The Azure Management APIs are all documented to use management.azure.com as a base URL. When you need to access the same URLs in the AzureUSGovernment cloud, the base URL changes to:

management.usgovcloudapi.net

At least, that was the change for a Cognitive Services account which I had soft deleted and then tried to purge; I had to resort to the REST APIs because the gov cloud tooling for Cognitive Services can lag the public cloud by many months.

Anyhow, it took me about 30 minutes to find this magic URL, so I’m passing it along to you.

One other helper note: if you need the access token to execute some management APIs from Postman (or similar), you can get the authentication token after a call to az login by using this command:

az account get-access-token

Then, just paste the value from the accessToken field into the Authorization header as:

Bearer <contents of accessToken>

Example generating the whole string and echoing it correctly in a bash shell:

access_token=$(az account get-access-token --query accessToken -o tsv) && echo "Bearer $access_token"
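As a worked example, here is roughly what purging that soft-deleted Cognitive Services account looks like against the gov cloud endpoint. The api-version and the placeholder values are illustrative; check the Cognitive Services REST reference for the current version:

# Assumes: az cloud set --name AzureUSGovernment && az login
access_token=$(az account get-access-token --query accessToken -o tsv)
curl -X DELETE \
  -H "Authorization: Bearer $access_token" \
  "https://management.usgovcloudapi.net/subscriptions/<subscription-id>/providers/Microsoft.CognitiveServices/locations/<location>/resourceGroups/<resource-group>/deletedAccounts/<account-name>?api-version=2023-05-01"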


Sick of posts saying “capitalism is best”

Really getting tired of the “bought a jet” story that ends with this moral: “Capitalism is freely giving your money in exchange for something of value. Socialism is taking your money against your will and shoving something down your throat that you never asked for.”

Pure capitalism is profit-maximizing and ignores the societal ills it causes, including the exploitation of fellow humans.

Pure socialism levels the playing field too much and takes away the benefits of risk + reward, leading a society to stagnate.

A blend of the many political ideologies makes a society work. I’ll ignore libertarianism, anarchism, communism, and others for now, but rest assured they are all needed for balance.

Socialism recognizes that some things should be communally owned because the profit isn’t present even though we want these things: public schools, roads, fire departments, police, water treatment, and the regulation and inspection that maintain real building, product, home, and workplace safety.

Capitalism provides the incentives to try new things and be rewarded. Our patent system is meant to allow inventors to recoup their investment while recognizing that ideas should go into the public domain eventually.

No one I know wants pure capitalism, socialism, libertarianism, communism, or anarchy. Purity in any of these points of view produces optimal outcomes for a few and bad results for the rest. We want a blend of ideologies, and we need to recognize when adding an anarchist, libertarian, socialist, or capitalist idea will benefit all of us.

Legalized alcohol balances a number of these ideologies: libertarian policies let me decide how much to drink so long as I don’t hurt others. Socialist policies fund police to watch for when I or my fellow humans drink and then operate heavy machinery (a car) while under the influence. That same socialist bit also makes sure the factors of production have a means to be transported, and that the facilities which manufacture the alcohol are safe to work in and produce product up to “healthy” standards. Capitalism improves the quality of the alcohol and ensures I have a variety of options to choose from.

In the US, I think we should add ambulance rides and recurring prescriptions for chronic conditions to the list of socialist policies for everyone. Likewise, some post-secondary education should be added; I’d value a medical doctor higher than a degree in Egyptology, since one scratches an itch while the other helps eliminate the issue causing the itch :)


createUiDefinition.json: Selecting existing resources

I was asked how to use two controls from Create UI definition elements – Azure Managed Applications | Microsoft Docs: ResourceSelector and ArmApiControl.

The first control displays a list of resources of a given type and is quite handy. The second calls an Azure REST API and returns the results. There is no default way to display those results; you need to couple the ArmApiControl with a display control if you want to show any values. I wrote the following little createUiDefinition.json file to demonstrate how to use both against something many actively used Azure subscriptions contain: storage accounts.

All the information is on a blade called bladeData. The first element uses the “easy” path: a ResourceSelector. The second uses the ArmApiControl paired with a DropDown. Both display the list of storage accounts in the customer’s subscription.

{
    "$schema": "https://schema.management.azure.com/schemas/0.1.2-preview/CreateUIDefinition.MultiVm.json#",
    "handler": "Microsoft.Azure.CreateUIDef",
    "version": "0.1.2-preview",
    "parameters": {
        "basics": [
            
        ],
        "steps": [
            {
                "name": "bladeData",
                "label": "My blade",
                "elements": [
                  {
                    "name": "resourceSelector",
                    "type": "Microsoft.Solutions.ResourceSelector",
                    "label": "Available storage accounts (Resource Selector)",
                    "resourceType": "Microsoft.Storage/storageAccounts",
                    "toolTip": "Select a storage account from the available list.",
                    "options": {
                      "filter": {
                        "subscription": "onBasics",
                        "location": "onBasics"
                      }
                    },
                    "visible": true
                  },
                  {
                    "name": "armApiControl",
                    "type": "Microsoft.Solutions.ArmApiControl",
                    "request": {
                      "method": "GET",
                      "path": "[concat(subscription().id, '/providers/Microsoft.Storage/storageAccounts?api-version=2021-04-01')]"
                    }
                  },
                  {
                    "name": "providerDropDown",
                    "type": "Microsoft.Common.DropDown",
                    "label": "Available storage accounts (Arm API)",
                    "toolTip": "Select a storage account from the available list.",
                    "constraints": {
                      "allowedValues": "[map(steps('bladeData').armApiControl.value, (item) => parse(concat('{\"label\":\"', item.name, '\",\"value\":\"', item.name, '\"}')))]"
                    },
                    "visible": true
                  }
                ]
            }
        ],
        "outputs": {
            "resourceGroup": "[resourceGroup().name]",
            "location": "[resourceGroup().location]"

        }
    }
}
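A note on the dropdown’s allowedValues expression: the ArmApiControl returns the raw REST response, and the map()/parse() pair converts each storage account in that response into the label/value shape a DropDown expects. The concat() builds a JSON string and parse() turns it into an object, so the expression yields an array like this (account names are illustrative):

[
  { "label": "mystorageacct1", "value": "mystorageacct1" },
  { "label": "mystorageacct2", "value": "mystorageacct2" }
]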


Showing costs in Azure Managed App createUIDefinition.json

Some Azure Managed Application publishers have ARM templates whose cost to run varies with the choices made at deployment. To help reduce sticker shock, those publishers want some way to show how choices in the deployment will influence the cost to run the application. Getting this value is not trivial: the costs will be different for different users, because Azure prices change based on a variety of conditions:

  • A commitment to use Azure, expressed in a contract with Microsoft, provides different discounts based on the dollar value of that commitment.
  • The customer may purchase Azure through a Cloud Solution Provider who discounts Azure usage.
  • The customer may be paying full, advertised price.
  • And so on…

At the time of this writing, no mechanism exists to look up a given customer’s prices for resources. That said, you can give a customer an idea of how much one resource type will cost: you can show them the costs for VMs. It is pretty common for an ARM template to create a variable number of virtual machines based on application parameters, and in these templates the virtual machine count can dominate the cost of running the managed application. Many partners want a mechanism to show how much those VMs will cost. While you cannot query the value directly, you can display the information to the user via a control in your createUiDefinition.json: the SizeSelector. How would this work in practice?

Let’s assume something quite basic: a user picks an integer value from a range, and the ARM template scales the VM count linearly with that integer: one VM for every 2 units. If we have the following Slider declared:

{
    "name": "unitCount",
    "type": "Microsoft.Common.Slider",
    "min": 2,
    "max": 100,
    "label": "Units",
    "subLabel": "",
    "defaultValue": 5,
    "showStepMarkers": false,
    "toolTip": "Pick the number of units",
    "constraints": {
        "required": false
    },
    "visible": true
}

Then we can let the user know how many VMs will be created and display that as a TextBlock where the text value is:

"text": "[concat(string(steps('ApplicationCharacteristics').unitCount.value), ' was selected. ', string(div(steps('ApplicationCharacteristics').unitCount.value, 2)), ' VMs will be created.' )]"

Finally, we can add the SizeSelector to show the cost of each allowed VM. Many ARM templates with this kind of setup constrain the VM selection to one to four sizes; we do that here by limiting the information displayed to the user with the allowedSizes constraint.

{
    "name": "sizes",
    "type": "Microsoft.Compute.SizeSelector",
    "label": "Size",
    "toolTip": "",
    "recommendedSizes": [
        "Standard_D32_v4",
        "Standard_D48_v4",
        "Standard_D64_v4"
    ],
    "constraints": {
        "allowedSizes": [
            "Standard_D4_v4",
            "Standard_D16_v4",
            "Standard_D32_v4",
            "Standard_D48_v4",
            "Standard_D64_v4"
        ]
    },
    "osPlatform": "Windows",
    
    "count": "[div(steps('ApplicationCharacteristics').unitCount.value, 2)]",
    "visible": true
}

We can now show the user how much each VM will cost. The user still needs to do some math to convert the cost per VM into an overall cost, but because the SizeSelector displays the user’s own prices, the user can get an idea of what the VM cost component of the managed application will be.

Please note: when running in the Create UI Definition Sandbox, the SizeSelector may show a Cost/month of “Unavailable” in some cases.

To try this on your own, try using the Sandbox and paste in the following createUiDefinition.json:

{
    "$schema": "https://schema.management.azure.com/schemas/0.1.2-preview/CreateUIDefinition.MultiVm.json#",
    "handler": "Microsoft.Azure.CreateUIDef",
    "version": "0.1.2-preview",
    "parameters": {
      "config": {
        "basics": {
          "description": "Sample UI Definition which uses a SizeSelector to estimate price",
          "resourceGroup": {
            "allowExisting": true
          }
        }
      },
      "basics": [],
      "steps": [
        {
          "name": "ApplicationCharacteristics",
          "label": "Pick some characteristics of your application",
          "subLabel": {
            "preValidation": "Select your options",
            "postValidation": "Done"
          },
          "bladeTitle": "Application options",
          "elements": [
            {
              "name": "Infoenvironment",
              "type": "Microsoft.Common.InfoBox",
              "visible": true,
              "options": {
                "icon": "Info",
                "text": "For every 2 units, we add one VM to the deployment."
              }
            },
            {
                "name": "unitCount",
                "type": "Microsoft.Common.Slider",
                "min": 2,
                "max": 100,
                "label": "Units",
                "subLabel": "",
                "defaultValue": 5,
                "showStepMarkers": false,
                "toolTip": "Pick the number of units",
                "constraints": {
                    "required": false
                },
                "visible": true
            },
            {
                "name": "unitDisplay",
                "type": "Microsoft.Common.TextBlock",
                "visible": true,
                "options": {
                  
                  "text": "[concat(string(steps('ApplicationCharacteristics').unitCount.value), ' was selected. ', string(div(steps('ApplicationCharacteristics').unitCount.value, 2)), ' VMs will be created.' )]"
                }
            },
            {
                "name": "sizeReview",
                "type": "Microsoft.Common.TextBlock",
                "visible": true,
                "options": {
                  "text": "Use the size selector to see how much the VMs will cost."
                }
            },
            {
                "name": "sizes",
                "type": "Microsoft.Compute.SizeSelector",
                "label": "Size",
                "toolTip": "",
                "recommendedSizes": [
                    "Standard_D32_v4",
                    "Standard_D48_v4",
                    "Standard_D64_v4"
                ],
                "constraints": {
                    "allowedSizes": [
                        "Standard_D4_v4",
                        "Standard_D16_v4",
                        "Standard_D32_v4",
                        "Standard_D48_v4",
                        "Standard_D64_v4"
                    ]
                },
                "osPlatform": "Windows",
                
                "count": "[div(steps('ApplicationCharacteristics').unitCount.value, 2)]",
                "visible": true
            }
          ]
        }
      ],
      "outputs": {
            "unitCount" : "[steps('ApplicationCharacteristics').unitCount.value]"
      }
    }
  }


Making a sandbox in Microsoft Partner Center

Anyone who sells through the Microsoft Azure Marketplace, Azure Portal, or AppSource will notice that there isn’t a nice place to test an offer. Many people want a sandbox environment where they can test away, incurring charges only for the Azure resources they use without being billed for the offer itself. Today, no sandbox exists. There are valid reasons for this, but this article isn’t going to go into those. Instead, I want to focus on how to test without encountering extra, unexpected charges. I will only cover the three main transactable offer types in Partner Center: Software as a Service (SaaS), Azure Application: Managed Application (AMA), and Virtual Machines. The pattern is the same for all three.

For every offer you publish in Partner Center, you need a companion test offer. The two offers are identical in all areas except for three:

  • Offer ID
  • Price on plans
  • Co-sell documentation (you don’t bother uploading this for the test offer)

Keep the plan IDs the same across both offers. This allows you to test your logic for provisioning or performing other actions on SaaS and AMA offers. It also lets you verify that a VM which uses the Instance Metadata Service (this works for Windows and Linux) to check the running image has the right logic. You may also have a test case for VMs which requires running under an “unknown” plan name, just to make sure your logic for running only ‘blessed’ images works.
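As a sketch, that IMDS check from inside a Linux VM looks something like the following; the api-version is illustrative, and the plan object carries the publisher, product (offer), and name (plan) of the marketplace image:

# Ask the Instance Metadata Service which marketplace plan this VM runs under.
# IMDS requires the Metadata:true header.
curl -s -H "Metadata:true" "http://169.254.169.254/metadata/instance/compute?api-version=2021-02-01" | jq '.plan'
# Expected shape: { "name": "<plan>", "product": "<offer>", "publisher": "<publisher>" }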

Any pricing on the test plans should be set to $0. Not even a small $0.00001, but $0. I’ve had partners use a $0.00001 meter and then have test code ring up a $10,000 bill against that meter in a single month. Zero pricing lets you validate the units reported in Partner Center without being asked to pay $10,000 so that Microsoft can take its processing fee and deposit the balance back into your bank account.

The test offer will contain the same text, plan IDs, descriptions, contact information, and so on. It will point to the same technical assets. The test environment will copy everything, with a couple of optional tweaks to allow development after your first publication:

  • SaaS: You may choose to point the landing page and web hook to the version of the asset which is in development or test.
  • AMA: Each plan will use the dev version of the zip file for the ARM template, UI Definition, and other assets. The technical configuration will also be updated. Once the test AMA is ready, these assets get copied to the production plans. You may also choose to point the Notification webhook to a test/dev version of the endpoint.
  • Virtual machine: Each VM SKU will get tested. Once all is working, the URLs for the OS disk and any data disks are copied to the production instance.

Finally, when publishing the test offer, never let the offer get past the Preview stage. If you do accidentally publish a test offer past Preview, you can stop selling pretty quickly (a stop sell can take up to a few hours).

Once all looks fine, copy the updated technical details to the production offer, review, and publish.

And that is how you give yourself a sandbox environment in Microsoft Partner Center.


Azure Managed Application: Customize allowed customer actions

When publishing an Azure Managed Application, many ISVs choose to make some functionality available to the owner of the application. You may know which of the Azure built-in roles you want to use, but not be sure which actions to include. The Azure built-in roles documentation page includes the lists you need for Allowed control actions and Allowed data actions: everything under Actions for the role goes into Allowed control actions, and anything under DataActions goes into Allowed data actions. You just need to supply each list as a semicolon-delimited string and you are good to go.

If you don’t want to read the docs and you know exactly what you want, you can also pull this information through the az cli or Azure PowerShell. To list all roles, run:

Az cli: az role definition list

PowerShell: Get-AzRoleDefinition

If you already know which role you need details on, you can run another command to get just the specifics for that role. For example, let’s say I know I need the Reader and Data Access role from Storage. I can run:

Az cli: az role definition list --name 'Reader and Data Access'

PowerShell: Get-AzRoleDefinition -Name 'Reader and Data Access'

Once you have the specific role, you can then emit the right values for the control actions and data actions. This is fairly easy to do in PowerShell.

$roleDefinition = Get-AzRoleDefinition -Name 'Reader and Data Access'

Write-Host "Control actions:" ($roleDefinition.Actions -join ";")

Write-Host "Data actions:" ($roleDefinition.DataActions -join ";")


ARM Templates: Pass all parameters to a script

I had an interesting inquiry the other day from a developer. She had a script which consumed most of the parameters in her ARM template, and every time she added a new parameter, she also had to remember to update the script call to pass it along. She needed a mechanism to simply pass all parameters to a bash script.

As a proof of concept, I just wanted to find a way to save the parameters onto a Linux VM. I did some looking at the various ARM template functions and noticed this one: deployment(). deployment() gives access to the ARM template itself as well as the parameters whose values differ from their defaultValue.

To use the value, I first captured the values of interest in the ARM template variables:

"variables": {
  "uniqueParams": "[deployment().properties.parameters]",
  "defaultParams": "[deployment().properties.template.parameters]",
  "userParams": "[replace(string(variables('uniqueParams')), '\"', '\\\"')]",
  "originalParams": "[replace(string(variables('defaultParams')), '\"', '\\\"')]"
},

Please note that the variables section above was trimmed to show just the capture of the parameters. The string() function turns the object into its JSON representation; replace() performs simple string substitution, here escaping the double quotes. Once captured, my proof of concept just needed to show that the values could be passed along. To do that, I added a custom script extension which emits the data to a well-known location.

{
    "type": "Microsoft.Compute/virtualMachines/extensions",
    "apiVersion": "2020-06-01",
    "name": "[concat(variables('vmName'),'/', 'RunScripts')]",
    "location": "[parameters('location')]",
    "dependsOn": [
        "[concat('Microsoft.Compute/virtualMachines/',variables('vmName'))]"
    ],
    "properties": {
        "publisher": "Microsoft.Azure.Extensions",
        "type": "CustomScript",
        "typeHandlerVersion": "2.1",
        "autoUpgradeMinorVersion":true,
        "settings": {
            "commandToExecute": "[concat('echo \"', variables('userParams'), '\" > /var/userParams.txt')]"
        }
    }
}

Once done, the following was emitted to userParams.txt:

{
  "vmNamePrefix": {
    "value": "scseely"
  },
  "vmSize": {
    "value": "Standard_DS2_v2"
  },
  "pwd": {
    "value": "p@ssw0rd"
  },
  "dnsName": {
    "value": "scseely"
  },
  "publicIPAddressName": {
    "value": "sycompscs"
  }
}

Likewise, if you need the default params as well, the file looks like this:

{
  "vmNamePrefix": {
    "type": "String",
    "metadata": {
      "description": "Assign a prefix for the VM name"
    }
  },
  "location": {
    "defaultValue": "[resourceGroup().location]",
    "type": "String",
    "metadata": {
      "description": "Select the Azure region for the resources"
    }
  },
  "vmSize": {
    "type": "String",
    "metadata": {
      "description": "Select the vm size"
    }
  },
  "userName": {
    "defaultValue": "azureadmin",
    "type": "String",
    "metadata": {
      "description": "Specify the OS username"
    }
  },
  "pwd": {
    "type": "SecureString",
    "metadata": {
      "description": "If Windows, specify the password for the OS username"
    }
  },
  "dnsName": {
    "type": "String",
    "metadata": {
      "description": "Specify the DNS name for the managed web app"
    }
  },
  "publicIPAddressName": {
    "type": "String",
    "metadata": {
      "description": "Assign a name for the public IP address"
    }
  }
}

Now, to read the file back, I can use a tool like jq to load and manipulate the JSON file in my scripts. Because the commandToExecute is just a bash command, I can stitch the emission of the JSON with scripts using ‘&&’.
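For example, a follow-on script could pull a single value back out of the emitted file. This assumes jq is installed on the image:

# Read one parameter value back out of the emitted JSON.
vm_size=$(jq -r '.vmSize.value' /var/userParams.txt)
echo "Deploying with VM size: $vm_size"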


Connect Application Insights to your Azure Functions App in Terraform

This goes into the “notes for Scott” category, where I post things to my blog for me. I hope this is somewhat useful for you too!

I’m in the process of writing Terraform automation for an Azure Functions application I’ve built. When the deployment completed and I went to the Azure Functions application in the Azure portal (https://portal.azure.com), I got a message stating that Application Insights wasn’t connected to the Functions App:

Application Insights is not configured. Configure Application Insights to capture function logs.

The fix isn’t well documented yet. After deploying a Functions app via the portal, I found the missing piece, and it’s pretty simple: Azure Functions uses an app setting named APPINSIGHTS_INSTRUMENTATIONKEY. Just add that setting with the right value and things work.

To put it all together, you will deploy an app service plan, Application Insights, and an Azure Function App:

resource "azurerm_app_service_plan" "app_service_plan" {
  name                = "${var.base_name}appserv"
  location            = "${azurerm_resource_group.rg.location}"
  resource_group_name = "${azurerm_resource_group.rg.name}"
  kind                = "Linux"
  reserved            = true
  sku {
    tier = "Basic"
    size = "B1"
  }
}
resource "azurerm_application_insights" "ai" {
  name                = "${var.base_name}ai"
  location            = "${azurerm_resource_group.rg.location}"
  resource_group_name = "${azurerm_resource_group.rg.name}"
  application_type    = "web"
}

Once created, the azurerm_application_insights resource has an attribute named instrumentation_key. Connect that to the APPINSIGHTS_INSTRUMENTATIONKEY app setting in your azurerm_function_app to connect Application Insights to your Azure Functions.

resource "azurerm_function_app" "apis" {
  name                      = "${var.base_name}func"
  location                  = "${azurerm_resource_group.rg.location}"
  resource_group_name       = "${azurerm_resource_group.rg.name}"
  app_service_plan_id       = "${azurerm_app_service_plan.app_service_plan.id}"
  storage_connection_string = "${azurerm_storage_account.az_backend.primary_connection_string}"
  https_only                = true
  version                   = "~2"

  app_settings = {
    APPINSIGHTS_INSTRUMENTATIONKEY = "${azurerm_application_insights.ai.instrumentation_key}"
  }

  site_config {
    cors { 
      allowed_origins = ["https://www.${var.domain_name}"]
    }
  }
}

Upon running this, the error message went away and Azure Functions showed I had connected everything correctly.


Using AzureAD PowerShell on *nix machines (Mac, Linux)

The Azure Active Directory team has a lot of great command line tooling. This is available in the Azure Cloud Shell (from the portal) as well as via the AzureAD PowerShell package. The .NET Core version of the PowerShell package is still in development, but it is available for us to use as needed. I’ve had to show a few folks on my team how to do this, so I’m recording the steps here as my “notes”. Run all of these from an elevated PowerShell session (e.g., sudo pwsh).

  1. Add the PowerShell Test Gallery. This gallery is a test site and may go down for any reason. The command to make it available is:

Register-PackageSource -Trusted -ProviderName 'PowerShellGet' -Name 'Posh Test Gallery' -Location 'https://www.poshtestgallery.com/api/v2'

  2. Install the module using this command:

Install-Module AzureAD.Standard.Preview

For what it’s worth, I installed PowerShell onto my Ubuntu box using the information over here.

If you need the Azure PowerShell module too, run:

Install-Module Az
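Once the modules are installed, a quick smoke test looks the same as it does on Windows. Connect-AzureAD will prompt you to sign in:

Import-Module AzureAD.Standard.Preview
Connect-AzureAD
# List a few users to confirm the connection works.
Get-AzureADUser -Top 5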
